ABSTRACT


Schaefer and Strong have proposed a new class of digital computer 
components which would perform two-dimensional array logic operations 
(tse logic) on binary data arrays. This dissertation is concerned with
the further development of tse logic concepts through the design of Golay 
transform processing machines that utilize tse logic. 

The basic tse components that are currently under development 
at NASA's Goddard Space Flight Center are described. The properties 
of Golay transforms which make them useful in image processing are
reviewed, and several architectures for Golay transform processors are 
presented with emphasis on the skeletonizing algorithm. 

A hardwired skeletonizing machine is designed using basic tse 
components. An output disable control line is shown to be an extremely 
useful addition to active tse logic devices. Two additional hardwired 
skeletonizing machines are developed using tse logic devices with 
an output disable control. Several alternate techniques are illustrated 
for performing the critical index recognition operation, and new tse 
logic devices are introduced. A unique pipeline architecture is 
developed for performing ultrahigh speed image processing. In addition, 
a programmable tse computer capable of performing numerous Golay trans-
forms is designed. Programs are written for performing both the skel-
etonizing and swelling algorithms.

Conventional logic control units are developed for the Golay 
transform processors. One is a unique microprogrammable control unit 
that uses a microprocessor to control the tse computer. The remaining
control units are based on programmable logic arrays.

Performance criteria are established and utilized to compare
the various Golay transform machines. On the basis of this research
a critique of tse logic is presented, and directions for additional
research are identified.


CHAPTER 1


INTRODUCTION 

A primary goal of computer architecture research is to increase the 
speed and efficiency of computing machines. The performance boundary 
for a particular machine organization is imposed by the basic physical 
limitations of the available hardware. Therefore, increased computing 
power must ultimately be obtained through improvements in computer organ- 
ization. Realizable computer architectures, however, are constrained by 
the availability of suitable hardware. Thus, optimum computer system 
development requires that component design and system architecture be 
considered in concert. This principle is the basis for the general 
goals of the research reported in this dissertation. 

Schaefer and Strong [1] have proposed a new class of digital com- 
puter components which would perform two-dimensional array logic opera- 
tions (tse logic) on binary data arrays. The goal of this research 
effort is to support NASA's development of the tse logic concept through 
the design of a two-dimensional parallel computer using hypothetical tse 
logic devices. By developing tse computer system architecture concepts
concurrently with the physical hardware, trade-offs between tse component
complexity and system design constraints can be optimized. For example,
useful additions to the tse logic family have been proposed as a result
of this research.

Numbers in brackets refer to those entries listed in the list of
references.

Tse is the English transliteration of the Chinese word for a
pictograph character.

Historical Background 

The concept of a two dimensional parallel computer was utilized by 
Unger [2, 3] as a method for improving the performance of digital com- 
puters in spatially oriented problems such as pattern detection and 
recognition. Unger's proposed spatial or SPAC computer consists of a 
master control unit and a rectangular array of logical modules (Figure
1.1). Each module contains a one-bit accumulator, several one-bit mem-
ory registers, and some logical circuitry. Modules communicate directly
with their four immediate neighbors. In addition, there is an external 
input in the form of a photocell. Thus, with the exception of the con- 
trol unit, a module has all the features of a rudimentary conventional 
computer. 

Unger's SPAC computer utilizes global control in which identical
commands are issued to all logical modules in the array. The master 
control unit includes a random-access memory for instruction storage, a 
program counter, and appropriate instruction decoding circuitry. Opera- 
tion is similar to that of a conventional digital computer control unit 
except that the control signals are distributed to each module in the 
rectangular array, rather than to a single arithmetic-logic unit. The 
global control feature of SPAC limits use of the machine to problems 
involving applied parallelism. 


Figure 1.1 (Figure labels: MASTER CONTROL.)

Unger recognized the importance of connectivity in spatial computer 
operations. As a result, each SPAC module is connected to its eight 
nearest neighbor modules through special link circuits (Figure 1.2). 

When a link instruction is executed, a storage register in each link is 
set if the accumulators of the two logical modules to which the link is 
connected both contain a one. The storage register is cleared if the 
accumulator of one, or both, of the modules contains a zero. The state 
of each link storage register remains fixed until another link instruc- 
tion is executed. 
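The link instruction can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not Unger's implementation: the function name `horizontal_links` and the list-of-rows accumulator layout are hypothetical, and only the horizontal link orientation is shown.

```python
def horizontal_links(acc):
    """Return the link registers for horizontally adjacent module pairs.

    acc is a list of rows of accumulator bits (0 or 1). The link between
    modules (r, c) and (r, c + 1) is set only if both accumulators hold
    a one; otherwise it is cleared. (Hypothetical data layout.)
    """
    return [[row[c] & row[c + 1] for c in range(len(row) - 1)] for row in acc]


acc = [
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]
print(horizontal_links(acc))  # [[1, 0, 0], [0, 1, 1]]
```

The other three orientations would follow the same pattern with different index offsets.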

Expand instructions are used in conjunction with the link operation
to perform tasks involving connectivity. For example, a horizontal
expand instruction causes a one to be placed in the accumulator of each
module which is connected through a horizontal chain of set link elements
to two modules with ones in their accumulators. An expansion may be per-
formed with respect to any combination of the four available link orien-
tations: horizontal, vertical, positive diagonal, and negative diagonal.

The real significance of the expand operation is that the contents of a 
module can be directly affected by the contents of arbitrarily distant 
modules. 

Although Unger's computer was impractical in terms of hardware cost,
the architecture of SPAC was simulated on an IBM 704 general purpose com-
puter. In this form, the machine was successfully applied to the recogni-
tion of alphanumeric characters and the detection of L-shaped patterns.

In 1962 Slotnick [4] proposed a more flexible parallel processing
computer called SOLOMON. Like Unger's machine, the SOLOMON computer,
shown in Figure 1.3, consists of a large rectangular array of processing
elements (similar to the logical modules in SPAC) and a central, global
control unit. There are, however, three basic differences between
SOLOMON and SPAC. First, the SOLOMON computer introduces limited local
control of each processing element by inhibiting execution of the current
instruction wherever the mode of a particular processing element does not
match the mode specified by the instruction. A processing element may
be in one of four modes as determined by internal conditions and stored
data. Thus, although identical instructions are broadcast to each proc-
essing element, individual conditional jumps can be programmed [5].

Figure 1.3 Organization of SOLOMON.

The second basic difference between SOLOMON and SPAC is the inter-
communication pattern of the processing element array. Unger's design
is based on a simple rectangular array with communication between a
module and its four nearest neighbors. The SOLOMON computer retains
this basic intercommunication pattern but with five possible modifica-
tions. These are as follows:

1. A vertical cylinder formed by establishing communication between 
the outside columns of the array. 

2. A horizontal cylinder formed by establishing communication 
between the outside rows of the array. 

3. A torus formed by combining the first two options. 

4. A single straight line formed by a connection of all the proc- 
essing elements. 

5. A circular array formed by connecting the end processing 
elements of the straight line. 
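The first three modifications amount to wrapping the neighbor indices at the array edges. The following Python sketch illustrates this under stated assumptions: the function `neighbors` and its flag names are hypothetical, and the straight line and circular options (4 and 5) are not covered because the source does not specify how the elements are ordered along the line.

```python
def neighbors(r, c, rows, cols, wrap_cols=False, wrap_rows=False):
    """Return the four nearest neighbors of element (r, c).

    wrap_cols joins the outside columns (vertical cylinder), wrap_rows
    joins the outside rows (horizontal cylinder), and both together give
    a torus. Neighbors that fall off an unwrapped edge are omitted.
    """
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if wrap_rows:
            nr %= rows
        if wrap_cols:
            nc %= cols
        if 0 <= nr < rows and 0 <= nc < cols:
            result.append((nr, nc))
    return result


print(neighbors(0, 0, 4, 4))                                  # plain array
print(neighbors(0, 0, 4, 4, wrap_cols=True))                  # vertical cylinder
print(neighbors(0, 0, 4, 4, wrap_cols=True, wrap_rows=True))  # torus
```

A corner element has two neighbors in the plain array, three on a cylinder, and four on the torus.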

A third distinction between the organizations of SOLOMON and SPAC
is the link structure included in SPAC. No equivalent structure was
provided in the SOLOMON computer since SOLOMON was designed primarily
for numeric processing.

The SOLOMON design was based on a 32 x 32 array of processing ele-
ments. A machine of this size was never constructed due to the exces-
sive hardware cost; however, a 10 x 10 array was built [6]. Research
using this machine led to the design of a SOLOMON II computer with a
faster clock rate, a faster multiply time, and a 24-bit word length [7].
Further studies and the advent of medium scale integrated circuits
encouraged the development of the ILLIAC IV computer which is the larg-
est parallel array computer now in existence [8].

The arithmetic-logic unit of the ILLIAC IV computer consists of 256 
processing elements arranged in four reconfigurable SOLOMON type arrays 
of 64 processors each (Figure 1.4). A separate global control unit is 
provided for each array so that the machine can be operated as four 
independent quadrants, two 128 element arrays, or one 256 element array. 
Overall system control is provided by a Burroughs B-6500 computer. 

Each processing element in the array requires $10^{4}$ emitter-coupled-
logic gates to execute $4 \times 10^{6}$ instructions per second. In addition,
there are 2048 sixty-four bit words of memory within each processing
element. A one giga-bit disk with a transfer rate of $10^{9}$ bits per sec-
ond is provided for mass data storage. Because of economic considera-
tions, only one quadrant of the four quadrant ILLIAC IV system was
originally constructed. Additional details of the ILLIAC IV design and
the applications research which has been undertaken using this machine
can be found in the engineering literature [5-9].

Figure 1.4 Organization of ILLIAC IV. (Figure labels: B-6500.)

In 1969 Golay proposed a two-dimensional computer to perform hex-
agonal parallel pattern transformations using the hexagonal module array
shown in Figure 1.5 rather than the traditional square module array [10].
The hexagonal tessellation simplifies connectivity dependent operations
since each point in the array has six equally distant neighbors rather
than four primary neighbors at the distance unity and four secondary
neighbors at the distance $\sqrt{2}$ as in the square module array. In addi-
tion, the Golay transformations specify division of the array into sub-
fields of non-neighboring modules (Figure 1.6) so that no two neighboring
modules need be operated on simultaneously as a function of each other.

The basic computational unit in Golay's proposed computer is the 
submodule which corresponds to a module in Unger's SPAC computer. Each 
submodule represents the state of one point in a two-dimensional binary 
image, and the entire image is represented by a planar, hexagonal layer 
of submodules. Normally, the machine consists of a number of layers 
which overlay each other to form a hexagonal array of modules. All of 
the submodules within one module are interconnected and, furthermore,
each submodule of a particular layer, k, communicates directly with its 
six nearest neighbor submodules within layer k. 

The Golay transform classifies the 64 possible patterns of the six 
element surround of a submodule into the 14 characteristic indices shown 
in Table 1.1. Submodule operations can be a function of the subfield 
select signal, the state (1 or 0) of the submodule, the index of the sub- 
module's surround, or the central control unit commands. Often, the 
modular operations are repeated until no further change occurs in the 
layer, k, on which the operations are being performed. This condition is 
detected by forming the modulo 2 sum of the current and the immediately 
preceding iterations in an auxiliary layer, x, and then determining 
whether or not layer x is empty. 
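The termination test described above can be sketched in Python as follows. This is a minimal illustration, not Golay's design: the helper names are hypothetical, the layer is stored as a rectangular list of rows rather than a hexagonal array, and `shrink_right` is a toy operation standing in for a real transform.

```python
def xor_layers(a, b):
    """Modulo 2 sum of two binary layers of equal size (the auxiliary layer x)."""
    return [[p ^ q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def is_empty(layer):
    """True if the layer contains no ones."""
    return not any(any(row) for row in layer)


def iterate_until_stable(layer, step, max_iterations=1000):
    """Apply step repeatedly until an iteration leaves the layer unchanged."""
    for count in range(1, max_iterations + 1):
        new_layer = step(layer)
        if is_empty(xor_layers(layer, new_layer)):
            return new_layer, count
        layer = new_layer
    raise RuntimeError("no stable layer reached")


def shrink_right(layer):
    """Toy operation: a one survives only if its right neighbor is also a one."""
    return [
        [row[c] & (row[c + 1] if c + 1 < len(row) else 0) for c in range(len(row))]
        for row in layer
    ]


print(iterate_until_stable([[1, 1, 1, 0]], shrink_right))  # ([[0, 0, 0, 0]], 4)
```

The fourth iteration produces no change, so the modulo 2 sum is empty and the loop stops.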
Figure 1.5 Arrangement of Golay's proposed hexagonal module array.

TABLE 1.1


GOLAY NEIGHBORHOODS


            0     0     1     1     1     1     1
           0 0   1 0   1 0   1 1   1 1   1 1   1 1
Pattern     +     +     +     +     +     +     +
           0 0   0 0   0 0   0 0   0 1   0 1   1 1
            0     0     0     0     0     1     1

Index       0     1     2     3     4     5     6

Weight      1     6     6     6     6     6     1     Σ = 32


            0     1     1     1     0     0     1
           1 1   1 0   1 0   1 1   0 0   1 0   0 1
Pattern     +     +     +     +     +     +     +
           0 0   0 1   0 0   0 0   1 1   0 1   1 0
            1     0     1     1     0     0     1

Index       7     8     9     10    11    12    13

Weight      2     6     6     6     6     3     3     Σ = 32
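The weights in Table 1.1 can be checked by enumerating the rotation classes of all 64 six-element surrounds. The Python sketch below is only an illustration: the function names are hypothetical, each surround is encoded as a 6-bit integer with adjacent bits taken as adjacent neighbors (an assumed encoding), and the classes are not numbered in the index order of the table.

```python
def rotations(pattern):
    """Return the set of cyclic rotations of a 6-bit surround (int 0..63)."""
    bits = [(pattern >> i) & 1 for i in range(6)]
    result = set()
    for shift in range(6):
        rotated = bits[shift:] + bits[:shift]
        result.add(sum(bit << i for i, bit in enumerate(rotated)))
    return result


def rotation_classes():
    """Group all 64 surrounds into classes that are equivalent under rotation."""
    seen = set()
    classes = []
    for pattern in range(64):
        if pattern not in seen:
            orbit = rotations(pattern)
            seen |= orbit
            classes.append(sorted(orbit))
    return classes


classes = rotation_classes()
weights = sorted(len(c) for c in classes)
print(len(classes))  # 14
print(weights)       # [1, 1, 2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6]
print(sum(weights))  # 64
```

The weight of a class is the number of distinct orientations of its pattern, so the two weight-1 classes are the empty and full surrounds and the weight-2 class is the alternating pattern.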
Although the two-dimensional computer proposed by Golay was never
constructed, a special purpose computer system capable of performing
simple Golay transforms on a three layer, 128 x 128 array was imple-
mented [11]. This Golay logic processor (GLOPR) was used successfully
to distinguish between two types of white blood cells and to perform
other pattern recognition tasks.

Kruse [12] proposed a parallel picture processing machine with 
many features similar to those of the Golay logic processor but utiliz- 
ing a rectangular module array. Each module within the array is a 
synchronous sequential circuit (Figure 1.7) which has a state transition 
function that is dependent on the present states of that module, the 
eight nearest neighbors of that module, and a set of global control 
signals. The number of possible states which a module may possess is 
limited by design economy since, for even a small number of states, the
number of possible neighborhood patterns becomes large. 

Each neighborhood pattern is called a template. State transitions 
occur when the neighborhood of a module matches one of the templates 
specified in a particular instruction. Kruse allows for rotationally
symmetric and iterative operations similar to those used in the Golay
transform. "Don't care" states within the neighborhood pattern and
limited arithmetical operations are also proposed. The potential prob-
lem of conflicting state transitions is avoided by forcing the template
matching operation to proceed in a specified order. Only the state 
transition indicated by the first matching template is allowed to occur. 
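Ordered template matching with "don't care" entries can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Kruse's design: modules are binary, a neighborhood is a 3 x 3 window in row-major order with the module itself at index 4, positions outside the image read as zero, and all names are hypothetical.

```python
DONT_CARE = None


def matches(window, template):
    """True if every template entry is a don't care or equals the window entry."""
    return all(t is DONT_CARE or t == w for w, t in zip(window, template))


def next_state(window, instruction):
    """Return the state given by the first matching template, else keep the state."""
    for template, new_state in instruction:
        if matches(window, template):
            return new_state
    return window[4]


def apply_instruction(image, instruction):
    """Apply one ordered template list to every module of a binary image."""
    rows, cols = len(image), len(image[0])

    def cell(r, c):
        return image[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [
        [
            next_state(
                tuple(cell(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)),
                instruction,
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


ISOLATED_ONE = (0, 0, 0, 0, 1, 0, 0, 0, 0)
ANY_ONE = (DONT_CARE,) * 4 + (1,) + (DONT_CARE,) * 4
instruction = [(ISOLATED_ONE, 0), (ANY_ONE, 1)]  # earlier templates take priority

image = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
print(apply_instruction(image, instruction))
# [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
```

An isolated one matches both templates, but only the first transition is taken, so the isolated point is removed while the connected points are kept.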

Two essentially serial, information extracting operations are 
employed by Kruse's parallel picture processing machine. The first is
a neighborhood counting operation in which the number of occurrences of
a specific neighborhood is computed during each local operation. This 
information is used for texture analysis and area measurements. The 
second is a coordinate extraction operation which identifies the coor- 
dinate of the first module encountered in a predetermined scanning 
sequence whose neighborhood matches a specified template. This process 
can be used to locate particular features within a picture. 
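Both information extracting operations reduce to a serial scan over the neighborhoods of the image. The Python sketch below illustrates them under stated assumptions: the image is binary, the scanning sequence is taken to be row-major (the source says only that it is predetermined), off-image positions read as zero, and the function names are hypothetical.

```python
def matches(window, template):
    """True if every non-None template entry equals the window entry."""
    return all(t is None or t == w for w, t in zip(window, template))


def scan(image):
    """Yield ((row, col), 3 x 3 window) in row-major scanning order."""
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            window = tuple(
                image[r + dr][c + dc]
                if 0 <= r + dr < rows and 0 <= c + dc < cols else 0
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            )
            yield (r, c), window


def count_matches(image, template):
    """Neighborhood counting: number of neighborhoods that match the template."""
    return sum(matches(window, template) for _, window in scan(image))


def first_match(image, template):
    """Coordinate extraction: first matching module in scanning order, or None."""
    for coordinate, window in scan(image):
        if matches(window, template):
            return coordinate
    return None


ISOLATED_ONE = (0, 0, 0, 0, 1, 0, 0, 0, 0)
image = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]
print(count_matches(image, ISOLATED_ONE))  # 2
print(first_match(image, ISOLATED_ONE))    # (0, 0)
```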

Since all eight nearest neighbors as well as multiple state tran- 
sitions are utilized in Kruse's machine, the logical modules must be 
quite complex. Therefore, Kruse did not attempt to present a truly 
parallel implementation of the design. Instead, a special purpose serial 
machine capable of performing one local operation at a time was con- 
structed (Figure 1.8). This machine simulates the parallel picture 
processing operation by sequentially matching templates to each neigh- 
borhood in the image. A conventional computer provides system control. 

One of the most recent parallel image processing machines, CLIP 3, 
was described by Stamopoulos [13]. CLIP 3 contains an iterative, 16 x 12 
array of logic cells. As shown in Figure 1.9, each cell includes a sum-
mation and threshold device, an OR gate, and a function generator. A
global control unit provides three control lines that select one of eight 
threshold levels and eight control lines that select the neighbor inputs 
which are to be summed. Either a square or hexagonal tessellation can 
be selected under program control. 

The output of the threshold unit is ORed with the contents of the
B storage register and applied to one input of the function generator.
Storage register A provides the second function generator input. Eight
control lines select the Boolean operations which are performed on the
inputs to produce two outputs, N and D. The N output, which represents
the state of the cell, is supplied to the threshold units of neighboring
cells.

Figure 1.9 Organization of a cell of CLIP 3. (Figure labels: PARALLEL
OUTPUT, SERIAL OUTPUT, SERIAL INPUT, PARALLEL INPUT.)

The CLIP 3 processor has been constructed using TTL logic and MOS 
memory at a cost of approximately $10,000. Serial scanning and display 
devices are used for economic reasons. The A and B storage devices are 
shift registers whose contents can be displayed on a dual beam oscillo-
scope. Pattern inputs are supplied by means of a light pen. Several
examples of successful software developed using CLIP 3 are given in [13].
A major limitation of this machine is the relatively small number of
cells in the array.


Future Requirements 

The two-dimensional parallel computers described in the previous
section are potentially orders of magnitude faster than conventional com-
puters in image processing and matrix manipulation tasks. Nevertheless,
even more powerful machines will be required to process the 50,000 images
per day that NASA expects to receive from earth observation spacecraft
during the 1980's [1]. In fact, other important tasks such as real-time
modeling of the weather through studies of ocean currents and atmospheric
conditions are already awaiting the advent of sufficiently powerful par-
allel processing machines [14].

In order to provide adequate speed and resolution, future machines
may require arrays containing as many as 1024 x 1024 elements. Construc-
tion of these machines using hardware organizations such as those des-
cribed in this chapter would be a formidable task. Although individual
modules with sufficient capabilities could probably be produced as single
integrated circuits (e.g., microprocessors), the problem of interconnect-
ing a million or more components to form the complete array would remain.
The development of inherently parallel tse logic devices as proposed by
Schaefer and Strong [1] is an attempt to make large two-dimensional par-
allel computers a reality.

A summary of the basic tse logic devices proposed by Schaefer and
Strong [1] is presented in Chapter 2. The neighborhood considerations
which directed this investigation toward implementation of the Golay
transform skeletonizing algorithm are discussed in Chapter 3. Chapters
4 and 5 present basic tse logic implementations of the skeletonizing
algorithm. Several new tse logic devices and a cost function are pro-
posed. A hardwired, conventional logic control unit for the tse logic
processor is developed. In Chapter 6, a high-speed implementation of
the skeletonizing algorithm, based on a unique application of the pipe-
line principle, is discussed. A programmable tse computer organization
which can perform Golay transform algorithms is developed in Chapter 7.
A microprocessor based control philosophy is proposed. Chapter 8 sum-
marizes the significant results of this research and suggests directions
for future investigations.


CHAPTER 2 


BASIC TSE LOGIC DEVICES AND CONCEPTS 

Previously, researchers have proposed two-dimensional parallel 
computer architectures based on planar arrays of logical modules. Each 
module is essentially a highly specialized microprocessor which, to 
obtain maximum processing speed, must have a different organization for 
each unique computer architecture. The tse logic devices proposed by 
Schaefer and Strong [1] represent an alternate technique for construct- 
ing large, two-dimensional parallel computers. This chapter summarizes 
the work of Schaefer and Strong as reported in [1]. 

Tse Logic 

A tse is a two-dimensional, rectangular matrix of binary data. Tse 
devices execute logic operations simultaneously on all sets of binary 
data from corresponding matrix positions within a group of input tses. 

The primitive operations AND, OR, NEGATE, and SLIDE demonstrated in 
Figure 2.1 form a functionally complete tse logic set. The AND, OR, and 
NEGATE operations represent the standard Boolean functions as applied 
to tses rather than bits, whereas the SLIDE operation translates a binary 
image an integer number of matrix positions in the ± x and/or ± y direc-
tion. Through the four basic SLIDE operations (right, left, up, and
down), tse logic provides a unique method of transferring data from any
tse matrix position (i,j) to any other matrix position (m,n). Any
Boolean function of arbitrarily selected tse data points can be generated
using the primitive tse operations. Therefore, any digital computer can
be constructed from a standard set of tse logic devices which perform
only elementary operations. The simplicity of the computational cells
within each tse logic device is expected to permit their successful inte-
gration into the required rectangular array.

Figure 2.1 An example of primitive tse operations on binary images.
(Figure labels: NOT A, SLIDE B DOWN.)
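The four primitive operations can be modeled in software with a few lines of Python. This is a sketch under stated assumptions rather than a description of the hardware: a tse is stored as a list of rows of 0 and 1, the function names are hypothetical, and positions vacated by a SLIDE are filled with zeros. The EXCLUSIVE-OR at the end shows how a further Boolean function is composed from the primitives alone.

```python
def tse_and(a, b):
    """Point-by-point AND of two tses of equal size."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def tse_or(a, b):
    """Point-by-point OR of two tses of equal size."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def tse_negate(a):
    """Complement every point of a tse."""
    return [[1 - x for x in row] for row in a]


def tse_slide(a, dx=0, dy=0):
    """Translate a tse dx columns to the right and dy rows down.

    Negative values slide left or up. Vacated positions become 0 and
    points pushed past the edge are lost.
    """
    rows, cols = len(a), len(a[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dy, c + dx
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = a[r][c]
    return out


def tse_xor(a, b):
    """EXCLUSIVE-OR built only from AND, OR, and NEGATE."""
    return tse_or(tse_and(a, tse_negate(b)), tse_and(tse_negate(a), b))


A = [[1, 0], [1, 1]]
B = [[1, 1], [0, 0]]
print(tse_and(A, B))         # [[1, 0], [0, 0]]
print(tse_negate(A))         # [[0, 1], [0, 0]]
print(tse_slide(B, dy=1))    # [[0, 0], [1, 1]]
print(tse_xor(A, B))         # [[0, 1], [1, 1]]
```

In this model, moving the point at row i, column j to row m, column n corresponds to `tse_slide(a, dx=n - j, dy=m - i)`.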

Electro-Optical Tse Logic 

Schaefer and Strong [1] proposed an implementation of the tse logic 
concept that is based on a combination of electronic and fiber optic 
technologies. The AND, OR, and NEGATE functions, as well as a special 
REFORMAT operation, are performed by active semiconductor devices. SLIDE 
operations and general data transfers are accomplished using passive fiber 
optic devices. 

Active tse logic devices consist of an integrated array of computa- 
tional cells. Each cell contains photo-detectors which convert the 
optical input into electrical signals. The electrical signals are pro- 
cessed by conventional MOSFET circuits and converted to an optical output 
by either exciting or failing to excite an electroluminescent material. 
Thus, the state of each data point in a tse is indicated by the presence 
(state 1) or absence (state 0) of light at the corresponding array 
position. 

Photon coupling between tse logic devices is provided by coherent 
fiber optic bundles which act as tse image data paths. Optical inputs 
to individual cells of multiple tse input devices, such as the AND and 
OR gates, are projected onto the integrated circuit by a fiber optic 
interleaver. The glass fibers which make up the interleaver are arranged
so that data points from corresponding matrix positions of two input
tses are brought into adjacency at the output, thereby becoming inputs 
to a single cell in the active tse logic device. Initially, the com- 
plexity of the interleaver fiber optic array will limit the number of 
tse inputs to two. The basic tse logic devices are illustrated in 
Figure 2.2, and schematic symbols for tse devices are listed in Appendix A. 

The problem of distributing light to multiple fiber optic bundles
restricts the basic fan-out of each tse logic device to one. However,
the interleaver can be used in reverse as an image duplicator. Half of
the light output of each electroluminescent point in the array appears
at each duplicator output. The two modes of operation for an interleaver
are illustrated in Figure 2.3, and a prototype interleaver is shown in
Figure 2.4. Because of the arrangement of the glass fibers within the
interleaver and the decreased light intensity at each output, an active
integrated circuit reformatter is normally required at each output. The
reformatter serves as a buffer and restores the proper signal levels.
Therefore, the effective fan-out of a tse logic device can be increased
to two by adding a duplicator (interleaver) and two reformatters to the
basic device. Further increases in fan-out can be obtained by cascading
additional duplicators and reformatters. As shown by Schaefer and Strong
[1], the reformatters can be replaced by negators when the complement of
the input tse is required. This approach reduces propagation delay and
the number of components required in some tse circuits but also reduces
the potential noise immunity of the negator device since the logic one
input threshold must be set to less than one-half of the normal logic
one light intensity. The basic tse logic circuit for the EXCLUSIVE-OR
of two tses is illustrated in Figure 2.5.
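The component cost of cascading can be estimated with a short calculation. The Python sketch below rests on two assumptions that go beyond the text: each duplicator splits one image into exactly two, and every duplicator output is buffered by a reformatter (no negator substitution). The function name is hypothetical.

```python
def fanout_components(fan_out):
    """Return (duplicators, reformatters) needed for a given fan-out.

    Each added duplicator turns one output into two, so a fan-out of n
    needs n - 1 duplicators, each followed by two reformatters.
    """
    if fan_out < 1:
        raise ValueError("fan-out must be at least one")
    duplicators = fan_out - 1
    reformatters = 2 * duplicators
    return duplicators, reformatters


for n in (1, 2, 4, 8):
    print(n, fanout_components(n))
# 1 (0, 0)
# 2 (1, 2)
# 4 (3, 6)
# 8 (7, 14)
```

The fan-out of two case reproduces the one duplicator and two reformatters stated above.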
Figure 2.2 Basic tse logic devices.

Figure 2.3 Two modes of interleaver operation. (Figure labels:
DUPLICATOR.)

Figure 2.5 Exploded view of a tse EXCLUSIVE-OR circuit. (Figure labels:
INPUT IMAGE A, INPUT IMAGE B, EXCLUSIVE OR OF A AND B.)
(Courtesy of Earth Observation Systems Division, Goddard
Space Flight Center).

SLIDE operations are accomplished by transferring the tse from one
fiber optic image path to another path which is offset in the x and/or y 
direction. Normally fibers in the output path which receive no inputs 
from the original tse are masked in order to guarantee a logical zero in 
the corresponding position of the output tse. Figure 2.6 demonstrates 
the image path configuration required for a SLIDE down operation. By 
utilizing the proper offset, a SLIDE gate can be constructed so as to 
translate a tse any number of matrix positions. 

SLIDE gates are one of several intercommunication devices proposed 
by Schaefer and Strong [1]. The other intercommunication devices being
considered for use in tse computers include a cycler, a rotater, a
vertical inverter, a horizontal inverter, a diagonal inverter, a hori-
zontal sweeper, a vertical sweeper, a contractor, a spiller, a magnifier,
and a demagnifier. In the context of this study, the most important
special intercommunication devices are the contractor and the spiller.
A contractor has one tse input and one optical output element. The out-
put signal is a logic one if any elements of the input tse are in the
logic one state. This device is useful for detecting an empty or all 
zero tse which corresponds to a black image. Two potential realizations 
of the contractor are illustrated in Figure 2.7. The spiller device 
shown in Figure 2.8 has one tse input and one tse output. The output 
tse is a white or all logic one image if and only if the element of the 
first row and first column of the input tse is a logic one. Spiller 
operation can be obtained from the tse circuit shown in the lower section 
of Figure 2.7 if all elements of the input image except element (1,1) 
are masked. 
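The logical behavior of the contractor and the spiller can be summarized in a brief Python sketch. The function names and the list-of-rows tse representation are assumptions made for illustration; the sketch models only the input-output behavior, not the optical realization.

```python
def contractor(tse):
    """Return 1 if any element of the input tse is a logic one, else 0."""
    return 1 if any(any(row) for row in tse) else 0


def spiller(tse):
    """Return a tse filled with the value of the first-row, first-column element."""
    value = tse[0][0]
    return [[value] * len(row) for row in tse]


empty = [[0, 0], [0, 0]]
image = [[1, 0], [0, 0]]
print(contractor(empty))  # 0
print(contractor(image))  # 1
print(spiller(image))     # [[1, 1], [1, 1]]
print(spiller(empty))     # [[0, 0], [0, 0]]
```

A contractor output of zero therefore signals an empty tse, which is the condition needed to detect that an iterative operation has produced no further change.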
Performance specifications for the electro-optical tse logic com-
ponents will be an important consideration in the development of tse
computer architectures. The experimental devices currently being devel-
oped are based on a 128 x 128 element array. Initially, active tse logic
devices will have a response time (propagation delay) of approximately
five milliseconds and will consume up to three watts of power. Passive
devices will add significant volume and weight to the tse computer
structure. For example, the prototype interleaver (Figure 2.4, page 27)
weighs 0.7 grams, is six centimeters long, and has an output image area
of one square inch. Major objectives in the development of tse logic
devices will be to increase the array size to 1024 x 1024 elements while
reducing the component size, power consumption, and propagation delay.
A major advantage of the tse logic concept is that these improvements
can be incorporated into individual active tse logic devices as the
electro-optical technology improves.

Tse Analog-to-Digital Conversion

A major application of tse logic will be processing the multiple 
gray level images received by earth observation spacecraft. Before these 
images can be processed, they must be digitized into a set of binary 
tses. Figure 2.9 demonstrates the concept of digitizing an image into
three tses by using threshold devices. The output tse data word consists
of three ordered tses and represents an eight level quantization of the 
original image. A six tse data word derived from a 1024 x 1024 pixel 
image would contain over six million bits of data representing 64 gray 
levels. Figure 2.10 illustrates the tse logic concept as applied to pro- 
cessing earth resources data. 
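The relation between the number of tses and the number of gray levels can be illustrated with a software analogue of the conversion. The Python sketch below is an assumption-laden illustration, not the optical threshold hardware: it assumes uniform quantization of integer gray values in the range 0 to 255 and a plain binary code, and the function name `digitize` is hypothetical.

```python
def digitize(image, n_tses, full_scale=256):
    """Quantize a gray level image into n_tses binary tses.

    Gray values are integers in range(full_scale). The result is ordered
    from the most significant tse to the least significant tse and
    represents 2 ** n_tses gray levels.
    """
    levels = 2 ** n_tses
    return [
        [[((value * levels // full_scale) >> k) & 1 for value in row] for row in image]
        for k in range(n_tses - 1, -1, -1)
    ]


image = [[0, 37], [128, 255]]
for tse in digitize(image, 3):
    print(tse)
# [[0, 0], [1, 1]]
# [[0, 0], [0, 1]]
# [[0, 1], [0, 1]]

# Size of a six tse data word for a 1024 x 1024 image, in bits.
print(6 * 1024 * 1024)  # 6291456
```

Three tses give eight levels and six tses give 64 levels, in agreement with the figures quoted above.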


Figure 2.9 Tse analog-to-digital conversion hardware
schematic for an eight level image. (Figure labels: INPUT IMAGE,
HALF SILVERED, MOST SIGNIFICANT TSE, LEAST SIGNIFICANT TSE.)
(Courtesy of Earth Observation Systems Division,
Goddard Space Flight Center)

Figure 2.10 Tse computer concept.
(Courtesy of Earth Observation Systems Division, Goddard Space Flight Center)



CHAPTER 3 


GOLAY HEXAGONAL PARALLEL PATTERN TRANSFORMATIONS

Many important pattern detection and recognition algorithms are 
based on nearest neighbor logic in which each point in a binary image is 
treated as the basis point of a localized neighborhood. These algorithms 
utilize image transformations which require various operations to be per- 
formed on the basis point as a function of the states of the basis point 
and the points within its neighborhood. The extent and form of the
neighborhood must be carefully matched to the application so that hard- 
ware complexity can be minimized. In the first section of this chapter, 
some factors which influence the choice of a neighborhood are discussed. 
The second section reviews a particular type of pattern recognition 
algorithm which utilizes nearest neighbor logic and is the basis of the 
tse logic designs presented in the remainder of this dissertation. 

Neighborhood Considerations 

As pointed out by Golay [10], there are only three types of planar,
symmetric, isotropic point arrangements: the square, hexagonal, and
triangular arrays. In a standard, uniformly-shaped rectangular digitiza-
tion pattern, each basis point has four primary neighbors at the distance
unity, four secondary neighbors at the distance $\sqrt{2}$, and additional
neighbors at distances of two and greater. The number of points which
can be considered in the basic neighborhood is limited by the practical
considerations of hardware complexity and cost. Therefore, the simple
eight point neighborhood shown in Figure 3.1 is normally selected for the
rectangular array. As shown by Kruse [12], more complex neighborhood
operations can be simulated by selected sequences of local operations
using the basic neighborhood.

The rectangular neighborhood is advantageous in pattern recognition
algorithms which depend on an orthogonal coordinate system. In general,
however, the rectangular neighborhood requires more complex processing
than the hexagonal neighborhood [10]. As shown by Golay [10], any oper-
ation on a basis point, P, which requires knowledge of the connectivity
properties of the rectangular neighborhood of P should be a function of
both the four primary and the four secondary neighbors of P. However,
since the secondary neighbors are at a somewhat greater distance from P,
the function should depend less strongly on these variables. Thus, the
development and implementation of algorithms involving the connectivity
of rectangular neighborhoods is a complex task.

This task becomes even more complex when the triangular array is
employed. In order to determine whether or not a basis point operation
will alter the connectivity of a triangular array, one must know the
states of three nearest neighbors at distance unity, six neighbors at
distance √3, and three neighbors at distance two. Therefore, the
triangular array is not normally used in pattern recognition algorithms.

In contrast, the uniform hexagonal digitization pattern proposed by
Golay [10] yields a basic neighborhood (Figure 3.2) of six points which
are equidistant from the basis point. As a result, the connectivity
properties of the hexagonal neighborhood are readily defined as a
function of the basis point and its six nearest neighbors. The single
disadvantage of the hexagonal array is that the natural coordinate axes
do not correspond to an orthogonal system. Therefore, the hexagonal array
is favored in applications, such as the extraction of earth resources
data from satellite pictures, where the required algorithms do not rely
on orthogonal coordinates.

Figure 3.2 Six point hexagonal neighborhood.


Golay Transforms 

The Golay transform is based on the set of 14 rotationally indepen- 
dent patterns of zero and one states which can occur in the surround 
(neighborhood) of a basis point within a hexagonal array. As shown in 
Table 1.1, page 13, each pattern is assigned a characteristic index and a 
weight which indicates the number of distinct orientations of the pattern
that can be obtained by rotating the illustrated surround. The sum of 
the weights is 64, the total number of possible neighborhood patterns. 
Because all patterns of the same index are considered equivalent, Golay 
transform operations are invariant under rotation of an image through 
angles which are multiples of 60°.
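
The pattern count and the weight total quoted above can be checked by brute force. The following is a minimal sketch, not part of the original design work: it groups all 64 six-neighbor surrounds into classes that are equivalent under rotation and counts the members of each class. The function names are hypothetical, and the classes are not labeled with Golay's index numbers, which are defined in Table 1.1.

```python
from itertools import product

def rotations(surround):
    """Return the set of distinct 60-degree rotations of a 6-tuple surround."""
    return {surround[r:] + surround[:r] for r in range(6)}

def rotation_classes():
    """Group all 2**6 surrounds into classes equivalent under rotation."""
    classes = {}
    for surround in product((0, 1), repeat=6):
        key = min(rotations(surround))      # canonical representative
        classes.setdefault(key, set()).add(surround)
    return classes

classes = rotation_classes()
weights = sorted(len(members) for members in classes.values())
print(len(classes), sum(weights), weights)
```

The class sizes play the role of the weights: they count the distinct orientations of each pattern, and their sum accounts for every possible surround.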

The connectivity property of a Golay hexagonal neighborhood is 
uniquely defined by the state of the basis point and the index of the 
surround. For example, changing the state of a basis point within a 
neighborhood of index six or less cannot alter the connectivity of the 
array. However, the connectivity of the array can be altered by changing 
the state of a basis point with a neighborhood of index seven or greater. 
A knowledge of both the state of a basis point and the index of its sur- 
round is sufficient to determine the effect which a particular pattern 
transform operation will have on the connectivity of a neighborhood with 
basis point P, if and only if the neighbors of P are not simultaneously 
transformed as a function of P. The Golay transform satisfies this 
condition by dividing the hexagonal binary data array into a number of
subfields, each of which contains only non-adjacent points. Three sub- 
fields are commonly used although, as shown in Figure 3.3, the array can 
also be divided into four or seven subfields. All data points within a 
particular subfield are processed in parallel. However, only one sub- 
field is transformed at a time to avoid the potential logical conflicts. 
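
The non-adjacency requirement on subfields can be illustrated with a short sketch. It assumes an axial (q, r) coordinate system for the hexagonal array and three particular labeling formulas; both are illustrative choices, not the numbering scheme of Figure 3.3, and the labels here start at zero.

```python
# The six hexagonal neighbor steps in axial (q, r) coordinates (assumed layout).
NEIGHBOR_STEPS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def subfield(q, r, count):
    """Hypothetical subfield label for point (q, r) with 3, 4, or 7 subfields."""
    if count == 3:
        return (q - r) % 3
    if count == 4:
        return (q % 2) + 2 * (r % 2)
    if count == 7:
        return (q + 3 * r) % 7
    raise ValueError("count must be 3, 4, or 7")

def has_adjacent_conflict(count, size=10):
    """True if any two neighboring points share a subfield label."""
    for q in range(-size, size):
        for r in range(-size, size):
            for dq, dr in NEIGHBOR_STEPS:
                if subfield(q, r, count) == subfield(q + dq, r + dr, count):
                    return True
    return False

for count in (3, 4, 7):
    print(count, has_adjacent_conflict(count))
```

Because no point shares a label with any of its six neighbors, every point of one subfield can be transformed in parallel without changing the surround of another point in the same subfield.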

Thus far this discussion has assumed a simple Golay transform in
which the surround of each basis point of an image, k, is equivalent to
the hexagonal neighborhood of that point. The Golay transform also
permits more general compound operations in which the surround of a basis
point of image k can be defined as a logical function of the hexagonal
neighbors of the corresponding points in a multiplicity of other binary
images, p,q,...,u,v. The compound Golay transform can be utilized in
processing multiple grey level images which have been digitized.

The general hexagonal parallel pattern transformations [10] are 
basis point operations which are performed simultaneously on all points 
within one subfield of an image and then sequentially on each subfield. 
Golay transforms are expressed in the form 

    k = M_{i_1, i_2, \ldots;\, n}\left[\, a_{i(L_1)} \cdot L_2 \cdot L_3 + \bar{a}_{i(L_1)} \cdot L_4 \,\right]    (1)

where M stands for the general basis point operation. The first
subscripts of M represent the indices of the basis point's surround for
which an operation will be performed. Subscript n stands for the number
of iterations required. Each iteration involves performing the indicated
operation on each of the specified subfields in turn. When n is not
specified, the operation is performed only once. If n is replaced by the
symbol ∞ instead of a number, the operation is performed until the image
ceases to be transformed by further iterations.

Figure 3.3 Hexagonal arrays of three, four, and seven subfields.

In Equation (1), k represents the binary state of the basis point
being operated on. When a simple Golay transform is being performed,
L_1 is a logical function of one or more of the hexagonal neighbors of
the basis point in image k. In the case of a compound Golay transform,
L_1 can also be a function of the hexagonal neighbors of corresponding
basis points in images p,q,...,u,v. In general, L_1 = L_1(k,p,q,...,u,v),
and L_1 = k implies a simple Golay transform. The surround of the
basis point consists of the six outputs of the function L_1 which
correspond to hexagonal neighbors a-f in the simple Golay transforms.
The term i(L_1) is the index of the surround of the basis point on which
the operation is being performed. When the index of the surround is
listed as a subscript of M, a_{i(L_1)} = 1; otherwise, a_{i(L_1)} = 0.

L_2 specifies which one of the three, four, or seven subfields is
currently being transformed. For an image divided into four subfields,
L_2 will be true for only one-fourth of the image points at any one time.
When the operation is to be performed on all subfields of an image, a
superscript of three, four, or seven may be given with M instead of
specifying L_2. L_3 and L_4 can each be either a control signal or a
logical function of the various images utilized in compound Golay
transforms.

A number of useful Golay transforms are given in [10] and [11]. One 
example is the simple skeletonizing operation defined by 

    k = M^{3}_{1,2,3;\,\infty}\left[\, \bar{a}_{i(k)} \cdot k \,\right]

When this algorithm is applied to a binary image plane, all simple blobs
are reduced to a single logic one point, and all blobs with holes are
reduced to one or more loops consisting of a single layer of logic one
points. Figure 3.4 demonstrates the application of this algorithm. The
superscript of M specifies that the points in the binary image are
assigned to one of three subfields by the numbering scheme shown in
Figure 3.3, page 42, and that the subfields are processed sequentially
in ascending order. The first subscript of M indicates that the basis
point operation is to be performed whenever the index of the surround is
one, two, or three. Therefore,

    k = 0    whenever i(k) = 1, 2, or 3
    k = k    otherwise

As specified by the second subscript of M, the skeletonizing operation is
performed iteratively until the image becomes stable. When each subfield
is processed, points within that subfield are set to zero unless they are
currently in state one and have a surround whose index is not one, two or
three. Figure 3.4 shows only one iteration of the algorithm. However,
this simple image will not be transformed further by another iteration,
so the final skeleton is the result of the third subfield operation.
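
The subfield-by-subfield procedure can be sketched in software. The sketch below is illustrative only and rests on three assumptions that are not taken from the text: points are addressed in axial (q, r) coordinates, the three subfields are labeled by (q - r) mod 3, and a surround of index one, two, or three is taken to mean a single contiguous arc of one, two, or three logic one neighbors. The index definitions in Table 1.1 should be consulted before relying on the last assumption.

```python
# Neighbor steps in cyclic order around the basis point (assumed axial layout).
NEIGHBOR_STEPS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def surround(point, ones):
    """Six-element list of neighbor states for a point, in cyclic order."""
    q, r = point
    return [1 if (q + dq, r + dr) in ones else 0 for dq, dr in NEIGHBOR_STEPS]

def is_single_run(bits, max_len=3):
    """True if the ones form one contiguous arc of 1..max_len neighbors."""
    count = sum(bits)
    rising = sum(1 for i in range(6) if bits[i] == 1 and bits[i - 1] == 0)
    return 1 <= count <= max_len and rising == 1

def skeletonize(ones):
    """Clear qualifying points one subfield at a time until the image is stable."""
    ones = set(ones)
    changed = True
    while changed:
        changed = False
        for field in range(3):
            # All points of one subfield are examined against the same image.
            doomed = {p for p in ones
                      if (p[0] - p[1]) % 3 == field
                      and is_single_run(surround(p, ones))}
            if doomed:
                ones -= doomed
                changed = True
    return ones

filled = {(0, 0)} | set(NEIGHBOR_STEPS)   # a seven point blob
ring = set(NEIGHBOR_STEPS)                # the same blob with a hole
print(skeletonize(filled), skeletonize(ring) == ring)
```

Under these assumptions the filled blob collapses to a single point while the ring is left unchanged, which mirrors the behavior described for simple blobs and blobs with holes.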

Although the skeletonizing algorithm is a simple Golay transform,
its tse logic implementation must embody the general Golay transform
principles. In addition, hardware minimization is a particularly
important consideration in the initial development of tse logic circuits.
Therefore, implementation of the skeletonizing algorithm will be
emphasized in the tse logic designs presented in the following chapters.
Little generality will be lost since, as demonstrated in Chapter 4, any
simple Golay transform can be implemented by a straightforward
modification of the basic tse logic skeletonizing machine.

Figure 3.4 An example of the Golay transform skeletonizing algorithm
(panels: original image; results of the first, second, and third
subfield operations).



CHAPTER 4 


BASIC TSE LOGIC DESIGN CONCEPTS AND THEIR APPLICATION TO 
DESIGN OF A GOLAY TRANSFORM PROCESSOR 

Every switching function can be expressed in a canonical
sum-of-products form, where each expression consists of a finite number
of switching variables, constants, and the operations AND, OR, and NOT
[15]. The family of electro-optical tse logic devices proposed by
Schaefer and Strong [1] is functionally complete since the AND, OR, and
NOT operations are all available. Thus, tse circuits analogous to each of
the major subdivisions of a conventional digital computer can be
designed. A computer organization in which arithmetic-logic unit,
control unit, and memory are all constructed from tse logic devices is
conceivable. If such a computer is developed by interconnecting tse
circuits which correspond directly to the logic circuits of a
conventional computer, the tse computer will simply be a two-dimensional
expansion of the conventional computer. Each element in the n x n array
of a tse logic device becomes one component of an elemental computer
which is isomorphic to the original conventional computer [1]. This
organization, however, does not realize the full power of tse logic
since tse intercommunication devices provide the potential for a more
sophisticated design in which data can be transferred between the
elemental computers. A new computer architecture based on the unique
characteristics of tse logic is required.

The goal of this research is to contribute to the development of
tse computer architectures through the design of tse logic circuits
which perform the Golay transform skeletonizing algorithm. These tse
logic units will be controlled by conventional logic control units.

A two-dimensional control unit would potentially allow dissimilar
operations to be performed on various areas of an image simultaneously
and independently. This flexibility, however, is not essential and
can only be obtained with a significant increase in the number of tse
logic components. Also, before the development of a two-dimensional
control unit is inaugurated, the design of tse arithmetic-logic units
should be thoroughly understood.

This chapter introduces some basic tse logic concepts and presents
one tse logic implementation of the skeletonizing algorithm. Additions
to the tse logic family proposed by Schaefer and Strong [1] are
described. Several circuits are presented for generating Golay neighbor
planes which simplify index recognition using tse logic.

Elementary Tse Processor Control 

Control of basic tse processing units can be achieved by providing 
control images that consist of all ones when true and all zeros when 
not true. The control image can be created by switching a light source 
that illuminates each element in a fiber optic bundle. One possible 
light source is an array of electroluminescent devices manufactured on 
a semiconductor substrate. This source is similar to the output array 
of a standard active tse logic device and is assumed to be switched by 
a single CMOS compatible control line. The state of the output tse cor- 
responds to the state of the control line. 

Some tse processing requires control of individual data points
within the tse. A number of different techniques can be used to set
particular data points within a tse to the logic one or logic zero
level. One method of obtaining this control is to AND or OR the tse
with a tse mask which contains a predetermined binary image. The tse
mask image can be permanently stored in an electro-optical tse memory
chip which contains an array of electroluminescent devices. A standard
electroluminescent array based on the light source described earlier
could be programmed to produce any required tse mask. Typically, the
array would be programmed by specifying the final metallization pattern
used in the manufacturing process to define the power connections to
elements of the array. The same general technique is currently employed
in the production of semiconductor read-only-memories. Alternately, the
tse mask can be produced from an all logic one electroluminescent array
by placing an "opaque and clear" photographic film mask between the
array and the fiber optic image path. In cases where the only operation
required is to set particular points within the data tse to zero, a film
mask can be placed directly in the data path. Still another possibility
is to design the active tse logic devices so that selected points within
their output electroluminescent arrays can be permanently set to either
logic level during manufacture. This might be accomplished by altering
the final metallization pattern as described above or by forming the
final integrated circuit interconnections using the recently developed
dye laser micro-welder [16].
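
The effect of ANDing or ORing a tse with a mask can be shown with a small software model. This is only a sketch: a tse is modeled as a list of rows of 0/1 values, and the function names and mask contents are hypothetical.

```python
def tse_and(a, b):
    """Point-by-point AND of two equally sized binary images."""
    return [[x & y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def tse_or(a, b):
    """Point-by-point OR of two equally sized binary images."""
    return [[x | y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

data = [[1, 0, 1],
        [0, 1, 1],
        [1, 1, 0]]

# Zeros in an AND mask force the corresponding data points to logic zero.
keep_center = [[0, 0, 0],
               [0, 1, 0],
               [0, 0, 0]]

# Ones in an OR mask force the corresponding data points to logic one.
set_top_middle = [[0, 1, 0],
                  [0, 0, 0],
                  [0, 0, 0]]

print(tse_and(data, keep_center))
print(tse_or(data, set_top_middle))
```

A film mask placed directly in the data path behaves like the AND case, since an opaque area can only force a point to zero.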

In both conventional and tse computer architectures, conditional
operations are often performed when a zero state is detected. Signals
to control these operations must be derived from a device which is
capable of detecting an empty or all zero tse. The contractor and
spiller devices described by Schaefer and Strong [1] can be combined to
perform this function. The practical logic circuit implementations of
these devices produce long propagation delays which could seriously
limit the throughput of a tse circuit. Therefore, a new tse device
which produces an output image that is all ones if and only if the
input image contains at least one logic one point should be developed.
This device is referred to as a total spiller. A primary objective in
the development of a total spiller should be to minimize propagation
delay through the circuit.
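
The input-output behavior of the total spiller can be modeled in a few lines. The sketch below describes only the logical function stated above, with hypothetical function names; it says nothing about how the device would be built or how fast it would be.

```python
def total_spiller(image):
    """Return an all ones image if any input point is one, else all zeros."""
    rows, cols = len(image), len(image[0])
    bit = 1 if any(any(row) for row in image) else 0
    return [[bit] * cols for _ in range(rows)]

def is_empty(image):
    """Zero detection: any single point of the spiller output can be tested."""
    return total_spiller(image)[0][0] == 0

print(total_spiller([[0, 0], [0, 1]]), is_empty([[0, 0], [0, 0]]))
```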

The timing relationships between tse control signals will depend
upon the propagation delay introduced by each tse logic device in the
circuit. At this time these propagation delays are not well defined;
however, the response time of the electroluminescent material in active
tse logic devices is expected to be the controlling factor. For this
reason, the response times for the various active devices are likely to
be approximately equal. Since some knowledge of propagation delays is
required for efficient circuit design, all active tse logic devices
except the spiller, combiner, and total spiller are assumed to introduce
a maximum of one unit propagation delay. Because the spiller, combiner,
and total spiller operations are more complex than the operation of a
typical gate, a propagation delay of two units has been arbitrarily
assigned to these devices. Images will be carried through tse buses and
passive devices, such as sliders and interleavers, at the speed of light,
so zero propagation delay is assumed for passive devices. These
assumptions allow relative comparisons of the performance of tse logic
designs and permit the design of conventional logic control units for
tse circuits. Although control unit timing should be modified as the
characteristics of real tse devices become known, the control techniques
illustrated in these designs will be useful in the development of future
control units.
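
These delay assumptions are easy to apply mechanically. The following sketch assigns the unit delays stated above to device categories and takes the slowest signal path as the circuit delay. The category labels and the two example paths are hypothetical and do not correspond to any circuit in this dissertation.

```python
# Unit propagation delays as assumed in the text; the keys are hypothetical labels.
UNIT_DELAY = {
    "active": 1,          # ordinary active tse gate
    "spiller": 2,
    "combiner": 2,
    "total_spiller": 2,
    "passive": 0,         # buses, sliders, interleavers
}

def path_delay(path):
    """Sum the unit delays of the devices along one signal path."""
    return sum(UNIT_DELAY[kind] for kind in path)

def circuit_delay(paths):
    """The circuit delay is set by the slowest path."""
    return max(path_delay(path) for path in paths)

example_paths = [
    ["passive", "active", "active"],           # slider, then two gates
    ["passive", "active", "total_spiller"],    # slider, gate, total spiller
]
print(circuit_delay(example_paths))
```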

Tse Memories

In the initial stages of tse logic development, integrated
read-and-write tse memories are not expected to be available because of
their complexity. Tse memory requirements will be met by using the
integrated circuit one tse read-only-memories* described in the previous
section and by constructing read-and-write tse memories from standard
tse devices. The simplest read-and-write tse memory is the OR latch
illustrated in Figure 4.1. This device stores one binary image in a
feedback path which is controlled by CMOS compatible signal C. The
input path is controlled by a second CMOS compatible signal labeled E.
Normally, the tse ROMs switched by C and E would contain all ones;
however, special image patterns could also be used or the control input
images could be generated by another tse circuit. When E = 1 and C = 1,
the next output image, Q⁺, is Q + I (Q OR I). This property of the OR
latch can be used to design logic circuits in which partial operations
are performed sequentially and ORed into the tse latch to produce the
final result.
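
The accumulation property of the OR latch can be sketched as follows. The model assumes that the control signals simply gate the two paths, so that the next output is (C AND Q) OR (E AND I); only the E = 1, C = 1 case is stated explicitly in the text, and the remaining cases are an assumption made for illustration. The function name is hypothetical.

```python
def or_latch_step(q, i, e, c):
    """One update of the assumed OR latch model: Q+ = (C AND Q) OR (E AND I).

    q and i are binary images; e and c are 0 or 1 and stand for all zeros
    or all ones control images on the input and feedback paths.
    """
    return [[(c & qv) | (e & iv) for qv, iv in zip(q_row, i_row)]
            for q_row, i_row in zip(q, i)]

partial_1 = [[1, 0], [0, 0]]
partial_2 = [[0, 0], [0, 1]]
empty = [[0, 0], [0, 0]]

q = or_latch_step(empty, partial_1, e=1, c=0)   # load the first partial result
q = or_latch_step(q, partial_2, e=1, c=1)       # OR in the second partial result
held = or_latch_step(q, empty, e=0, c=1)        # hold the stored image
print(q, held)
```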

A comparable tse latch based on the AND gate is shown in Figure
4.2. The input and feedback paths are controlled by CMOS compatible
signals H and S, respectively. As specified by the control table, the
output of the AND latch is Q · I (Q AND I) when H and S are both zero.
This property of the AND latch can be used to form the final result of
a tse operation by logically ANDing the results of a set of sequential
operations. Although an objective of tse logic design is to attain
high speed operation through parallelism, serial logic circuits are
often necessary for component minimization. Throughput can remain high
because of the parallel nature of the tse logic devices themselves.

*The images stored in each tse read-only-memory used in this
dissertation are defined in Appendix B.

Figure 4.2 One tse AND latch (with control table for control signals
H and S).

A major disadvantage of the basic latch circuits is that the output
image is destroyed before storage of the input image can be guaranteed.
In circuits where the next input to the latch depends upon the
present output of the latch, incorrect operations can result. A
master-slave memory, such as the one shown in Figure 4.3, can be used to
avoid this difficulty. The timing diagram in Figure 4.3 illustrates the
control signal sequence for storing a new tse in the master-slave memory.
Timing is based on the following conservative, worst-case assumptions
about the active tse logic devices:

Minimum propagation delay - 0 units 
Maximum propagation delay - 1 unit 
Minimum turn-on time - 0 units 

Maximum turn-off time - 1 unit 

The assumption of a minimum propagation delay of zero assures that
correctly generated control signals will not permit race conditions to
develop in tse circuits. In tse timing diagrams, the turn-on and
turn-off timing constraints are illustrated by showing control signal
state changes as if they occur over a time span of one gate delay.
Actually, control signal state changes will occur instantaneously, and
the state of the tse logic device being controlled can change at any
time up to one gate delay later.

Hexagonal-to-Rectangular Array Transformation

Standard tse logic devices utilize a rectangular array of binary 
data points and are not directly compatible with the hexagonal array 
employed in the Golay transforms. This difficulty can be overcome by 
noting that a rectangular array will be created from the Golay hexa- 
gonal digitization pattern if the data points in even numbered rows 
are shifted one-half unit distance to the left. The Golay neighborhood 
of each basis point, P, in the resulting rectangular array can be de- 
fined using the knowledge of which points formed the neighborhood of P 
in the original array. Subfield assignments and typical neighborhood 
patterns for rectangular arrays with three, four, and seven subfields
are shown in Figure 4.4. Note that the neighborhood pattern for cor- 
responding points in arrays of three and seven subfields are identical. 
Also note that data which is originally acquired through a hexagonal 
digitization pattern can be processed using the square tse logic array 
without introducing error. Data acquired through a scanning process 
can be automatically stored in a square array. If the data acquisition 
process is parallel, a special fiber optic bundle could be designed to 
convert the hexagonal array into a square array. Alternately, one 
could utilize a standard rectangular digitization pattern and merely 
assign particular Golay neighbors to each basis point. For sampling 
intervals which are small compared to the critical dimensions of inter- 
est in the binary image, the minor errors introduced by this process 
should be acceptable. 
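
The row-shift construction can be checked numerically. In the sketch below, point (i, j) of the rectangular array is placed back at its hexagonal position by moving even numbered rows one-half unit to the right, and the Golay neighbors are then found as the points at unit distance. The row numbering (starting at zero), the unit row spacing of √3/2, and the function names are assumptions made for illustration.

```python
import math

def hex_position(i, j):
    """Hexagonal-grid position of rectangular array element (row i, column j)."""
    x = j + 0.5 if i % 2 == 0 else float(j)   # undo the half-unit left shift
    y = i * math.sqrt(3) / 2
    return x, y

def golay_neighbors(i, j):
    """Rectangular array positions whose hexagonal distance from (i, j) is one."""
    cx, cy = hex_position(i, j)
    found = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) == (0, 0):
                continue
            x, y = hex_position(i + di, j + dj)
            if math.isclose(math.hypot(x - cx, y - cy), 1.0):
                found.append((i + di, j + dj))
    return found

print(golay_neighbors(2, 5))   # basis point in an even row
print(golay_neighbors(3, 5))   # basis point in an odd row
```

Each basis point has exactly six neighbors in the rectangular array, but the column offsets of the neighbors in the rows above and below depend on whether the basis point lies in an even or an odd row.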


(a) Three Subfields    (b) Four Subfields    (c) Seven Subfields

Figure 4.4 Subfield assignments and neighborhood patterns for
rectangular arrays with three, four, and seven subfields.

Golay Neighbor Planes

Golay's modular operations are normally functions of the index of
the basis point, P. As a result, a Golay transform processor must be
capable of performing logical operations on a point P_{i,j} as a function
of its coplanar Golay neighbors which, in the case of four subfields
for example, are located at positions P_{i,j-1}, P_{i-1,j}, P_{i-1,j+1},
P_{i,j+1}, P_{i+1,j}, and P_{i+1,j-1}. In contrast, binary tse logic
operations are performed on separate image planes in parallel, and the
state of a point in the output array is only a function of the state of
corresponding points in position Y_{i,j} of the input image arrays. Thus,
for index recognition using tse logic, the state of each neighbor of a
basis point, P_{i,j}, should be available in position Y_{i,j} of a
separate image plane. Six Golay neighbor planes (GA, GB, GC, GD, GE, and
GF) are required to represent the state of each neighbor of every basis
point in a tse.

A tse logic circuit which will produce the six Golay neighbor
planes for a tse divided into four subfields is shown in Figure 4.5.
Plane GA is formed by sliding the input image right one element. This
action transfers a point in position P_{i,j-1} of the original image to
position P_{i,j} of plane GA. Therefore, each point in plane GA is the A
Golay neighbor of the point which occupies the same relative location
in the original tse. Similarly, plane GB contains the B Golay
neighbors, and so forth. The input image is assumed to have a one element
deep border of dummy zeros since a complete neighborhood cannot be
defined for the border elements.
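
The sliding operations can be modeled directly. The sketch below builds six neighbor planes for the four subfield case by shifting the image with zero fill. The slide helper is hypothetical, and the assignment of the letters B through F to particular offsets is an assumption made for illustration; only the construction of plane GA is spelled out in the text.

```python
def slide(image, down=0, right=0):
    """Shift an image by whole elements, filling vacated positions with zeros."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            si, sj = i - down, j - right      # source element for position (i, j)
            if 0 <= si < rows and 0 <= sj < cols:
                out[i][j] = image[si][sj]
    return out

def neighbor_planes_four(image):
    """Neighbor planes for four subfields (letter-to-offset mapping assumed)."""
    return {
        "GA": slide(image, right=1),            # holds P[i][j-1]
        "GB": slide(image, down=1),             # holds P[i-1][j]
        "GC": slide(image, down=1, right=-1),   # holds P[i-1][j+1]
        "GD": slide(image, right=-1),           # holds P[i][j+1]
        "GE": slide(image, down=-1),            # holds P[i+1][j]
        "GF": slide(image, down=-1, right=1),   # holds P[i+1][j-1]
    }

image = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
planes = neighbor_planes_four(image)
print(planes["GA"])
```

With a single logic one point at the center, each plane contains a single one at the position of the basis point that has the center as that particular neighbor.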

Figure 4.5 Golay neighbor planes generator for rectangular arrays
with four subfields.

When a tse is divided into three or seven subfields, a somewhat
more complex Golay neighbor planes generator circuit is required because
the surrounds of basis points in even and odd numbered rows are oriented
in opposite directions. The A, B, C, D, E, and F Golay neighbors of
basis point P_{i,j} are P_{i,j-1}, P_{i-1,j}, P_{i-1,j+1}, P_{i,j+1},
P_{i+1,j+1}, and P_{i+1,j}, respectively, if i is even but are P_{i,j-1},
P_{i-1,j-1}, P_{i-1,j}, P_{i,j+1}, P_{i+1,j}, and P_{i+1,j-1},
respectively, if i is odd. Since P_{i,j-1} is neighbor A in each case,
plane GA can be created by sliding the original tse right one element.
However, sliding the input tse down one element in an attempt to create
plane GB actually results in an image which contains B Golay neighbors
in even rows and C Golay neighbors in odd rows. To obtain a tse with B
Golay neighbors in odd rows, one can slide the original image down one
element and to the right one element. By combining the odd rows of this
tse with the even rows of the previous image, a single tse which
corresponds to Golay neighbor plane GB can be obtained. This process is
illustrated in Figure 4.6 which is a schematic for one possible Golay
neighbor planes generator circuit for images divided into three or seven
subfields. As shown in Appendix B, the one tse read-only-memory labeled
ME consists of all ones in odd rows and all zeros in even rows. Mask MO
is the complement of ME. The Type 1 Golay neighbor planes generator
circuit requires 38 active and 29 passive tse logic devices. Based on
the standard assumption of one unit gate delay for each active tse logic
device, the complete circuit has a propagation delay of six gate delays.
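
The construction of plane GB from two slid images and the row masks can be sketched as follows. The sketch assumes that rows are numbered from zero, so that ME holds ones in rows 1, 3, 5, and so on; the helper names are hypothetical, and the actual mask images are defined in Appendix B.

```python
def slide(image, down=0, right=0):
    """Shift an image by whole elements, filling vacated positions with zeros."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            si, sj = i - down, j - right
            if 0 <= si < rows and 0 <= sj < cols:
                out[i][j] = image[si][sj]
    return out

def row_mask(rows, cols, parity):
    """All ones in rows whose number has the given parity, zeros elsewhere."""
    return [[1 if i % 2 == parity else 0] * cols for i in range(rows)]

def tse_and(a, b):
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def tse_or(a, b):
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def plane_gb(image):
    """GB for three or seven subfields: even rows of one slide, odd rows of another."""
    rows, cols = len(image), len(image[0])
    me = row_mask(rows, cols, 1)    # ones in odd rows
    mo = row_mask(rows, cols, 0)    # complement of ME
    even_part = tse_and(slide(image, down=1), mo)            # P[i-1][j], i even
    odd_part = tse_and(slide(image, down=1, right=1), me)    # P[i-1][j-1], i odd
    return tse_or(even_part, odd_part)

image = [[(3 * i + 5 * j) % 4 == 0 and 1 or 0 for j in range(5)] for i in range(6)]
gb = plane_gb(image)
print(gb)
```

Two slides, two masked ANDs, and one OR are needed for this single plane, which suggests why the device count of the masked generator is comparatively high.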

Figure 4.7 illustrates the advantages which can be obtained by
replacing active tse read-only-memory masks with film masks inserted
directly in the signal path. The boxes labeled FE and FO represent
photographic film masks which contain the same patterns as the active
tse masks ME and MO shown in Figure 4.6. An opaque area of the film
forces the corresponding tse data point to zero by blocking the light.
Clear areas of the film have no effect on their corresponding tse data
points. This implementation of the Golay neighbor planes generator
circuit requires 22 active and 21 passive tse devices. Circuit
propagation delay is five gate delays, one less than for the Type 1
circuit. The technique of programming selected points in the output
array of an active tse device to one or the other logic level could be
used to implement the Type 2 Golay neighbor planes generator without
photographic masks.

Figure 4.7 Type 2 Golay neighbor planes generator for rectangular
arrays with three or seven subfields.

The Golay neighbor planes generator circuit can be simplified even
further by using the proposed new tse logic device illustrated in Figure
4.8. An EXCHANGE gate is a passive tse intercommunication device with
two tse inputs (A and B) and two tse outputs (A and B). The EXCHANGE
gate is constructed from optical fibers using the same techniques
employed in the construction of tse interleavers and image buses.
Output image path A contains the fibers which carry the data from the odd
rows of the image A input and the fibers which carry the data from the
even rows of the image B input. The B output path contains fibers from
the even rows of A and the odd rows of B. Thus, the EXCHANGE gate
performs the operation of exchanging the data in the alternate rows of
two images. This is one of the operations required to generate the
Golay neighbor planes for tses divided into three or seven subfields.
A schematic for the equivalent circuit of the row EXCHANGE gate is
presented in Figure 4.9. Since the tse array is square, a similar column
exchange operation can be obtained by rotating the EXCHANGE gate fiber
optic array 90° with respect to the image buses interfaced to the
inputs and outputs of the device. The EXCHANGE gate illustrates that
special fiber optic arrays can be constructed to perform certain tse
logic functions that would normally require several active devices and
tse masks. The special tse logic devices reduce circuit power
consumption and propagation delay.
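
The row exchange performed by the gate can be stated compactly in software. In this sketch rows are assumed to be numbered from zero, so "odd rows" means rows 1, 3, 5, and so on; the function name is hypothetical.

```python
def exchange(a, b):
    """Model of the row EXCHANGE gate for two equally sized images.

    Output A carries the odd rows of input A and the even rows of input B;
    output B carries the even rows of input A and the odd rows of input B.
    """
    out_a = [list(ra) if i % 2 == 1 else list(rb)
             for i, (ra, rb) in enumerate(zip(a, b))]
    out_b = [list(ra) if i % 2 == 0 else list(rb)
             for i, (ra, rb) in enumerate(zip(a, b))]
    return out_a, out_b

ones = [[1, 1], [1, 1], [1, 1]]
zeros = [[0, 0], [0, 0], [0, 0]]
out_a, out_b = exchange(ones, zeros)
print(out_a, out_b)
```

Since the model only reroutes rows, it involves no logic operations, which is consistent with treating the gate as a passive device with zero propagation delay.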

EXCHANGE gates are employed in the third type of Golay neighbor
planes generator circuit shown in Figure 4.10. Only one output of each
EXCHANGE gate is used. This implementation of the Golay neighbor
planes generator requires only 18 active and 21 passive tse logic
devices. Propagation delay is reduced to four gate delays by the passive
nature of the EXCHANGE gates. A comparison of the three types of Golay
neighbor planes generator circuits is provided in Table 4.1.

Index Recognition

With the neighborhood points for each element of an image
available in the six Golay neighbor planes, the task of designing
circuits to perform any binary function of a basis point and the basis
surround reduces to a relatively standard logic design problem. Tse logic
circuits can be designed to recognize any surround index or combination
of indices. An index recognition circuit can be a sequential circuit
or a pure combinational circuit. The major distinction between the
conventional logic design procedure and the required tse logic design
procedure is that many of the traditional minimization procedures do
not apply to tse logic design because of the fan-in and fan-out
restrictions. An important similarity between conventional and tse logic
circuit designs is that a trade off generally exists between operating
speed and circuit complexity. This fact is demonstrated by the index
recognition circuits presented below.

Figure 4.10 Type 3 Golay neighbor planes generator for rectangular
arrays with three or seven subfields.

TABLE 4.1

CHARACTERISTICS OF THE GOLAY NEIGHBOR PLANES GENERATOR
CIRCUITS FOR RECTANGULAR ARRAYS OF THREE
OR SEVEN SUBFIELDS

Circuit   Figure   Number of         Number of        Propagation Delay in
Type      Number   Passive Devices   Active Devices   Standard Gate Delays

Type 1    4.6      29                38               6
Type 2    4.7      21                22               5
Type 3    4.10     21                18               4


The Golay transform skeletonizing algorithm requires recognition
of basis points which have a surround index of one, two, or three.
Figure 4.11 shows the schematic of a combinational circuit which
produces an output tse with zeros marking the positions of basis points
that have an index of one, two, or three. Each cell in the space
iterative circuit recognizes one of the six rotationally equivalent
orientations of a surround with index one, two, or three. Six cells are
required to recognize all of the possible orientations of the surrounds.
This index recognition circuit has a propagation delay of only ten gate
delays but requires 125 active and 77 passive tse logic devices.

Figure 4.12 illustrates a time iterative index recognition circuit
which determines the index by comparing the input tse to tse masks.
Each Golay neighbor plane is EXCLUSIVE-ORed with a tse mask to produce
a tse with logic one points marking positions where the neighbor plane
and the mask have complementary values. These six planes are ANDed
together to form the input to the tse latch. The tse latch input
contains logic one points only in positions where none of the corresponding
Golay neighbor plane points match their corresponding masks. If all
of the input masks are turned on, the tse latch input image will
contain logic one points in the positions of basis points with a surround
of index zero. Each of the 64 possible combinations of input mask
states corresponds to one orientation of one of the 14 possible indices.
All basis points with a particular surround index can be recognized by
sequentially checking for every orientation of that surround and ORing
the partial results into the tse latch. The timing diagram shown in
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Figure 4.13 illustrates the control sequence required to recognize an 
index with a weight of three. To perform the skeletonizing algorithm,
basis points with index one, two, or three must be recognized. This 
procedure takes 215 unit gate delays but requires only 61 active and 
33 passive tse logic devices (Table 4.2). An important feature of this 
index recognition circuit is that any index or combination of indices 
can be recognized. With the addition of a NEGATE operation, recognizing 
that the index of a basis point is not an element of a specified set 
A is equivalent to identifying the index as an element of set B where 
the union of A and B is the set of all 14 indices. The upper bound on 
the time required to recognize all basis points with any given set of 
indices is established by the time required to identify half of the 64 
possible surround orientations. Thus, in the worst case, 383 unit gate 
delays are required to identify all basis points with indices from a 
particular set. 
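
The counts quoted above (64 surround orientations, 14 indices, and a worst case of half the orientations) can be checked by enumeration. The following is a minimal Python sketch, assuming that a surround is encoded as a tuple of six neighbor states in cyclic order and that an index is a class of surrounds that are equivalent under rotation. The function names are illustrative and do not come from the source.

```python
from itertools import product


def rotations(surround):
    """Return the set of cyclic rotations of a 6-tuple of neighbor states."""
    return {surround[i:] + surround[:i] for i in range(len(surround))}


def index_classes():
    """Group the 64 possible surrounds into rotation-equivalence classes."""
    classes = {}
    for surround in product((0, 1), repeat=6):
        key = min(rotations(surround))  # canonical representative of the class
        classes.setdefault(key, set()).add(surround)
    return classes


classes = index_classes()
num_indices = len(classes)
num_orientations = sum(len(members) for members in classes.values())
# With a NEGATE operation, either the requested set or its complement can be
# searched, so at most half of the orientations ever need to be checked.
worst_case_checks = num_orientations // 2
print(num_indices, num_orientations, worst_case_checks)
```

Under these assumptions the enumeration yields 14 classes covering all 64 orientations, so the worst case search covers 32 orientations.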


Golay Function 

Once the index recognition circuits have been designed, a Golay 
function circuit can be developed to perform the logical operations 
specified by a particular Golay transform. The Golay function circuit 
for performing the swelling operation [10] defined by the symbol

[symbol illegible in the source]

is illustrated in Figure 4.14. As shown in Figure 4.15, the skeleton- 
izing algorithm requires an even simpler Golay function circuit. 
Similar Golay function blocks can be designed for any Golay transform. 





TABLE 4.2

CHARACTERISTICS OF THE SPACE AND TIME ITERATIVE
INDEX RECOGNITION CIRCUITS

Circuit           Figure   Number of        Number of         Time Required to Recognize
Type              Number   Active Devices   Passive Devices   Indices 1, 2, and 3
                                                              (Unit Gate Delays)

Space Iterative   4.11     125              77                10
Time Iterative    4.12     61               33                215
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Figure 4.14 Golay function circuit for a swelling operation. 
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Figure 4.15 Golay function circuit for a skeletonizing operation. 

Note that although the Golay function is performed simultaneously on
all points of the layer input image, the results at any one time are
only valid for one subfield within the image. The Golay function must
normally be performed sequentially on each subfield of the image. A
hardwired tse logic circuit which illustrates the procedure for
performing a Golay transform is presented in the next section.

Implementation of the Skeletonizing Algorithm 
The task involved in the Golay transform skeletonizing algorithm,

[expression illegible in the source] (6)

is to shrink all simple blobs in the binary image plane, K, to a single
one, while reducing blobs with holes to one or more loops consisting
of a single layer of ones. During an iteration through the algorithm,
the specified modular operation (Golay function) is performed in
parallel on the points within each of the three subfields in sequence.
As each subfield is processed, the basis points within that subfield which
are in state one and have a surround index not equal to one, two, or
three are allowed to remain in state one while all other points in that
subfield are set to zero. The process is complete when further
iterations produce no change in the output image.
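
The procedure described above can be sketched in software. The following Python fragment is an illustration only, not the tse implementation. It assumes a hexagonal array addressed by axial coordinates (q, r), a three subfield assignment given by (q - r) mod 3, and a reading of indices one, two, and three as surrounds whose neighbors in state one form a single contiguous arc of one, two, or three points. All names are hypothetical.

```python
# Six neighbor offsets of a hexagonal cell in axial (q, r) coordinates,
# listed in cyclic order around the cell (assumed layout).
NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def subfield(q, r):
    """Assign each cell to one of three subfields so that no two
    neighboring cells share a subfield."""
    return (q - r) % 3


def has_index_1_2_or_3(bits):
    """Assumed reading of indices one, two, and three: the neighbors in
    state one form a single contiguous arc of length 1, 2, or 3."""
    transitions = sum(bits[i] != bits[(i + 1) % 6] for i in range(6))
    return sum(bits) in (1, 2, 3) and transitions == 2


def iterate(image, order=(0, 1, 2)):
    """One iteration: apply the modular operation to each subfield in turn."""
    current = set(image)
    for s in order:
        cleared = set()
        for (q, r) in current:
            if subfield(q, r) != s:
                continue
            bits = [(q + dq, r + dr) in current for dq, dr in NEIGHBORS]
            if has_index_1_2_or_3(bits):
                cleared.add((q, r))
        current -= cleared  # all points of the subfield change in parallel
    return current


def skeletonize(image, order=(0, 1, 2)):
    """Iterate until a further iteration produces no change."""
    image = set(image)
    while True:
        result = iterate(image, order)
        if result == image:
            return result
        image = result


line = {(q, 0) for q in range(5)}   # a simple blob: five points in a row
ring = set(NEIGHBORS)               # six points around an empty center
print(skeletonize(line))            # a single remaining point
print(skeletonize(ring))            # the loop is left unchanged
```

In this sketch a simple blob shrinks to a single point, while a loop of single points surrounding a hole is left unchanged, which matches the behavior described in the text.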

A block diagram of the tse logic implementation of the skeleton- 
izing algorithm is presented in Figure 4.16, and a schematic for the 
circuit is illustrated in Figure 4.17. This particular design is
intended for high speed processing and, therefore, utilizes the space 
iterative, combinational index recognition circuit. The timing diagram 





Figure 4.16 Block diagram of a tse logic implementation of the skeletonizing algorithm 
Figure 4.17 Schematic of a hardwired si[...]

provided in Figure 4.18 shows the sequence of control signals required
by the skeletonizing machine.

The operation of the skeletonizing machine can be explained by
following an image, K, through one iteration of the skeletonizing
algorithm. Assume that the binary image labeled K in Figure 4.19 is
available at the layer input of the skeletonizing machine. Upon
completion of the current task, the layer input logic will automatically
gate (t = 0) the new image plane K into latches A and A' for temporary
storage. Golay neighbor planes are generated using EXCHANGE gates
(t = 18), and basis points with a surround whose index is not one, two,
or three are identified by the index recognition circuit. At t = 28
the output of the index recognition circuit is ANDed with image K to
obtain the Golay function output image, F (t = 29). Although the entire
image is being processed, only the first subfield results are valid.
Image F propagates unaltered through the subfield multiplexing circuit
and the layer input logic to the input of latch A. Subfield one of
image F' is then ANDed into latch A (t = 48) by manipulating the tse
mask control lines so that only the first subfield portion of the
register A input control image changes. Latch A now contains subfields
two and three of the original image, K, and a new subfield on which
the Golay modular operation has been performed. The first subfield
operation is now complete. The image created by the first subfield
operation is allowed to propagate back through the circuits described
above with the result that the second subfield is processed and ANDed
back into latch A (t = 82). After the completion of the second subfield
operation, the resultant binary image is again processed by the Golay
neighbor planes generator (t = 86), the index recognition circuit

Figure 4.19 Images formed during one iteration of the skeletonizing algorithm.

(t = 96), and the Golay function circuit (t = 97). Since the third
subfield operation marks the completion of one iteration, the
resultant subfield is not ANDed directly back into latch A. Instead, the
third subfield of image F is combined with subfields one and two from
image latch A by the subfield multiplexing circuit to produce tse
F' (t = 101). Image F' is EXCLUSIVE-ORed (t = 106) with the original
image K which has been stored in latch A'. The resultant image
contains a one in the position of any basis point which changed state as
a result of a modular operation performed during the current iteration.
If no points have been altered, image F' is the Golay skeleton of the
original image K and processing is complete. The total spiller
produces a control tse which allows a new image, labeled J, to be accepted
for processing and causes the layer output true image to become all
ones. However, if any basis point was altered during the current
iteration, another iteration is required. In that case, the control
image produced by the total spiller allows image F' to be stored in
latches A and A' (t = 126). Processing then continues as described
above until the skeleton of the image is obtained.

Using this skeletonizing machine, each iteration of the skeleton- 
izing algorithm takes 112 unit gate delays. The total image processing 
time is, of course, dependent upon the number of iterations required 
to obtain the skeleton of a particular image. Processing time is not 
a function of the size of the tse array. A total of 208 active and 
134 passive tse components are required for this implementation of the 
skeletonizing operation. Of these, 116 active and 58 passive compo- 
nents are required because of the fan-in and fan-out limitations of 
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the electro-optical tse logic devices. The number of basic tse logic 
components required to perform the skeletonizing algorithm can be 
reduced to 146 active and 91 passive devices by using the comparison 
type index recognition circuit and a negator. The time for each iter- 
ation is increased to 679 unit gate delays. 

Evaluation of Skeletonizing Machines 

Adequate evaluation and comparison of skeletonizing machines
requires a set of performance parameters. Several performance
characteristics of the skeletonizing machine described in this chapter are
given in Table 4.3. A hardware cost function defined by

Hardware Cost = Ax + By + Cz

where x = number of active tse devices

y = number of passive tse devices

z = number of unit lengths of fiber
optic bundle

and the constants A, B, and C are weighting
factors which are determined from the
actual size and price of the components

is proposed for comparison of various machines. This cost function
provides a more complete representation of circuit size and power
consumption than a total component count. The distinction between active
and passive tse devices is maintained because active devices consume
power without contributing significant weight or bulk, whereas the
passive devices consume no power but contribute substantially to
circuit weight and bulk. The interconnecting fiber bundles are also
important because of their weight and bulk. The length of fiber bundles


TABLE 4.3 


BASIC HARDWIRED SKELETONIZING MACHINE PERFORMANCE CHARACTERISTICS
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required for a particular circuit cannot be accurately determined at 
this time, and, therefore, is not included in Table 4.3. 
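
As an illustration of how the cost function might be applied, the following Python sketch evaluates it for the two Chapter 4 skeletonizing machine variants using the device counts quoted in the text. The weighting factors are hypothetical placeholders chosen only for the example, and the fiber bundle term is set to zero because its length is not known.

```python
def hardware_cost(active, passive, fiber_units=0.0, a=1.0, b=1.0, c=1.0):
    """Hardware Cost = A*x + B*y + C*z with weighting factors a, b, c."""
    return a * active + b * passive + c * fiber_units


# Device counts quoted in the text for the two Chapter 4 designs.
space_iterative = {"active": 208, "passive": 134}   # 112 gate delays/iteration
comparison_type = {"active": 146, "passive": 91}    # 679 gate delays/iteration

# Hypothetical weights (assumption): an active device is taken to cost
# three times as much as a passive device.
A, B = 3.0, 1.0

cost_fast = hardware_cost(**space_iterative, a=A, b=B)
cost_small = hardware_cost(**comparison_type, a=A, b=B)
print(cost_fast, cost_small)
```

With these placeholder weights the comparison type design has the lower cost, at the price of the longer iteration time noted in the text. Different weights would change the magnitudes but not the form of the comparison.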

The relative speed of various skeletonizing machines can be most 
accurately determined by considering the average rate at which simple 
images can be processed. A simple image is defined as any image whose 
skeleton is itself. Only one iteration is required to process a simple 
image. The simple image processing operation represents the most 
basic task which includes all operations required in the general skele- 
tonizing algorithm. Note, however, that the time required to process 
a typical image will be much greater than the time required to process 
a simple image. 

Both peak and average power consumption data are provided in Table 
4.3. Peak power consumption provides an indication of the relative 
cost and size of the power supplies required for the circuit. Average 
power consumption is important for space applications and in battery 
powered equipment where the total available energy is limited. A gen- 
erally accepted figure of merit which includes consideration of machine 
speed and power consumption is provided by the speed-power product. 

The speed-power product listed in Table 4.3 reveals the energy which
must be expended to perform one skeletonizing operation. The time and 
power consumption figures given in Table 4.3 are based on the five 
millisecond propagation delay and three watt power consumption pro- 
jected by NASA [1] for early prototype tse devices. 
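
The relationship between these quantities can be illustrated with a short calculation. The Python sketch below assumes that the three watt figure applies to each active device, that all active devices are powered for the whole iteration, and that a simple image requires a single iteration of 112 unit gate delays on a machine with 208 active devices. The results are illustrative and are not taken from Table 4.3.

```python
GATE_DELAY_MS = 5         # projected propagation delay per unit gate delay
POWER_PER_DEVICE_W = 3    # projected power per active device (assumed per device)


def simple_image_time_ms(delays_per_iteration):
    """Time to process a simple image, which needs only one iteration."""
    return delays_per_iteration * GATE_DELAY_MS


def peak_power_w(active_devices):
    """Peak power if every active device is enabled at once (assumption)."""
    return active_devices * POWER_PER_DEVICE_W


def speed_power_product_j(time_ms, power_w):
    """Energy in joules expended during one skeletonizing operation."""
    return time_ms * power_w / 1000


time_ms = simple_image_time_ms(112)
power_w = peak_power_w(208)
energy_j = speed_power_product_j(time_ms, power_w)
print(time_ms, power_w, energy_j)
```

Under these assumptions a simple image takes 560 milliseconds at a peak of 624 watts. Disabling idle devices would lower the average power and hence the energy per operation.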


CHAPTER 5 


IMPLEMENTATION OF THE SKELETONIZING ALGORITHM USING 
TSE LOGIC DEVICES WITH A DISABLE INPUT 

The tse logic control technique described in Chapter 4 permits
conventional logic control of tse circuit operations while requiring the
CMOS compatible output disable signal to be provided for only one basic
tse device, the electroluminescent array read-only-memory. Each control
signal requires one passive and two active tse devices (Figure 5.1) to
complete the interface with the tse circuit. One unit gate delay is also
added to the tse data path. Two improved skeletonizing machines and a
technique for reducing the number of tse devices required for the control
signal interface are presented in this chapter. In addition, a
conventional logic organization for the control units of dedicated tse logic
circuits is developed, and several index recognition circuits which
illustrate the trade off between circuit complexity and operating speed are
characterized.


Improved Conventional Logic Control Signal 
Interface to Tse Circuits 

A majority of the control signals applied to tse logic circuits are 
used to disable image propagation through selected data paths. This goal 
could also be accomplished by deactivating the output electroluminescent 
array of the active tse logic devices from which the image is originat- 
ing. A CMOS compatible control line which activates the array when in 
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Figure 5.1 Basic conventional logic control signal inter- 
face to tse circuits. 
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the logic one state and deactivates the array when in the logic zero state 
can be provided for each active tse logic device. Figure 5.2 illustrates 
the hardware cost reduction obtained when a tse OR latch is constructed 
using this control technique. Note that control signal E is applied to 
the active tse device which creates the input image. The hardware cost 
function for the improved one tse OR latch is 3A + 2B compared to a hard- 
ware cost of 7A + 4B for the basic one tse OR latch (Figure 4.1, page 52). 
Circuit operating speed is also improved by this control technique. One 
disadvantage of the control technique is that the complexity of the
integrated circuit, active tse logic devices is increased. This
disadvantage should be offset by the reduced tse logic power consumption which will
be obtained when the electroluminescent array is deactivated. The power
consumption of a deactivated tse device could be reduced to essentially 
zero by allowing the control signal to power-down the logic circuit por- 
tion of the tse device as well as the electroluminescent output array. 

An improved one tse AND latch which uses the proposed control tech- 
nique is illustrated in Fig. 5.3. Note that control signal S is inter- 
faced to the feedback tse data path using the same technique employed in 
the basic one tse AND latch (Figure 4.2, page 53). This control signal
interface could be simplified by assuming that some active tse logic devices
are available which are enabled for normal operation or forced to produce 
an all logic one output tse depending upon the state of a control line. 

An increase in the complexity of the active tse logic devices would be 
required. Since this increase will not be associated with additional ad- 
vantages such as reduced power consumption, the assumption that this class 
of tse logic devices will be available will not be made at this time. 

After additional experience has been gained in the design of tse logic 
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Figure 5.2 Improved one tse OR latch 
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circuits and in the fabrication of tse devices, a decision can be made 
concerning the development of tse devices with this capability. 

Some Additional Improved Tse Memories 

The circuit schematic and timing diagram for an improved one tse
master-slave memory are presented in Figure 5.4. Only ten unit gate
delays are required to store a new image in this memory whereas 15 unit
delays are required to store an image in the basic master-slave tse
memory (Figure 4.3, page 55). The cost function of the improved memory is
6A + 4B compared to a cost function of 14A + 8B for the basic master-slave
memory.

Master-slave tse memories can be chained to create tse shift
registers of any desired length. An example is the six tse circular right
shift register illustrated in Figure 5.5. Nine active and six passive
tse devices are required in each section of this register for a total
cost function of 54A + 36B. Figure 5.6 illustrates the typical control
sequences for this shift register. A functionally equivalent shift
register constructed to use the elementary control technique would require
21 active and 12 passive tse devices in each section for a total hardware
cost of 126A + 72B. The hardware cost of this circuit is reduced by more
than 50 percent by adding control lines to the active tse devices.

An alternate implementation of the six tse, circular right shift,
parallel-input, parallel-output shift register is shown in Figure 5.7.
This design does not exhibit master-slave capability on the parallel
inputs; however, as illustrated in Figure 5.8, this design does have the
advantage of parallel loading in four unit gate delays rather than the
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nine unit gate delays required by the true master-slave design. The same 
number of tse components are required to construct either circuit. 

Improved Index Recognition Circuits 

As demonstrated in Chapter 4, index recognition circuit character- 
istics are important factors in determining the hardware cost and data 
rate of a tse logic implementation of the skeletonizing algorithm. The 
space iterative index recognition circuit (Figure 4.11, page 70) requires 
only ten unit gate delays to recognize basis points with a surround of 
index one, two, or three but has a hardware cost function of 125A + 77B. 

A number of index recognition circuits which offer a range of choices in 
the trade-off between circuit complexity and operating speed are desir- 
able since component minimization is an important goal in the development 
of tse logic circuits. The basic technique for reducing the amount of 
hardware required to implement an index recognition circuit is to design 
the circuit to operate in a time sequential mode. 

Figure 5.9 shows an index recognition circuit which identifies basis 
points with a surround of index one, two, or three by checking for one
possible orientation of each of the three indices at a time. The order
of the Golay neighbor plane inputs to the combinational logic portion of
the circuit is rotated by the multiplexer circuit so that basis points 
with the correct index will be identified regardless of the orientation 
of their surround. The partial result obtained after each rotation is 
stored in the OR latch. As demonstrated by the timing diagram in Figure 
5.10, 75 unit delays are required to recognize all three indices using 
this circuit. The cost function for the multiplexed index recognition 
circuit is 104A + 69B. Thus, this circuit requires 21 fewer active 
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devices and eight fewer passive devices than the space iterative index 
recognition circuit. The multiplexed index recognition circuit is not 
practical for implementation using basic tse devices which lack a single 
line disable input since a minimum of 38 control line interfaces would be 
required. This would increase the circuit cost function to 180A + 107B 
which is 42 percent more components than the high performance space 
iterative index recognition circuit (Figure 4.11, page 70) requires. 
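
The arithmetic behind this comparison can be reproduced directly. The Python sketch below assumes, as stated at the start of this chapter, that each basic control signal interface adds two active and one passive tse device. The helper name is illustrative.

```python
ACTIVE_PER_INTERFACE = 2    # active devices added per basic control interface
PASSIVE_PER_INTERFACE = 1   # passive devices added per basic control interface


def with_basic_interfaces(active, passive, interfaces):
    """Device counts after adding basic control signal interfaces."""
    return (active + ACTIVE_PER_INTERFACE * interfaces,
            passive + PASSIVE_PER_INTERFACE * interfaces)


multiplexed = (104, 69)       # multiplexed circuit with disable inputs
space_iterative = (125, 77)   # space iterative circuit (Figure 4.11)

basic_multiplexed = with_basic_interfaces(*multiplexed, interfaces=38)
extra_percent = (100 * (sum(basic_multiplexed) - sum(space_iterative))
                 / sum(space_iterative))
print(basic_multiplexed, round(extra_percent))
```

The 38 interfaces raise the device counts to 180 active and 107 passive, which is about 42 percent more components than the space iterative circuit requires.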

An alternate technique for rotating the Golay neighbor planes is
illustrated by the shift register based index recognition circuit
presented in Figure 5.11. This circuit can recognize all basis points with
a surround of index one, two, or three in 82 unit gate delays (Figure
5.12). The cost function for the shift register based index recognition
circuit is 68A + 45B. Thus, a 34 percent savings in tse components can
be realized with only a nine percent increase in propagation delay by
rotating the Golay neighbor planes with a shift register rather than a
multiplexer. The shift register type index recognition circuit can be
realized using the basic control technique (Figure 5.1, page 87) at a
hardware cost of 140A + 81B. Such a design would be impractical, however,
since improved performance could be achieved at a lower hardware cost
using the space iterative index recognition circuit (Figure 4.11, page 70).

The hardware cost of performing the index recognition task using the 
comparison type circuit (Figure 4.12, page 71) discussed in Chapter 4 is 
60A + 33B. As illustrated in Figure 5.13, the hardware cost of the com- 
parison type index recognition circuit can be reduced to 20A + 13B if 
integrated circuit EXCLUSIVE-OR gates are available. Because of the
general usefulness of the EXCLUSIVE-OR function, EXCLUSIVE-OR gates are
proposed for development as the first complex, integrated tse logic device.
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Figure 5.12 Timing diagram for the shift register based index 
recognition circuit. 

An improved comparison type index recognition circuit using EXCLUSIVE-OR
gates requires only 159 unit gate delays (Figure 5.14) to identify all
basis points with a surround of index one, two, or three. The worst case
time required to identify basis points with any set of indices is only
285 unit gate delays.
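
The quoted recognition times imply a fixed time increment for each surround orientation that is checked. The Python sketch below derives that increment under three assumptions that are not stated explicitly in the source: recognition time grows linearly with the number of orientations checked, indices one, two, and three together comprise 18 orientations (six each), and the worst case corresponds to 32 orientations.

```python
def per_orientation_delay(t_small, n_small, t_large, n_large):
    """Incremental unit gate delays per additional orientation checked."""
    return (t_large - t_small) / (n_large - n_small)


# Comparison type circuit of Chapter 4: 215 delays for indices one to three,
# 383 delays in the worst case.
basic_increment = per_orientation_delay(215, 18, 383, 32)

# Improved comparison type circuit with EXCLUSIVE-OR gates: 159 and 285 delays.
improved_increment = per_orientation_delay(159, 18, 285, 32)

print(basic_increment, improved_increment)
```

Under these assumptions the integrated EXCLUSIVE-OR gates shorten each orientation check from 12 to 9 unit gate delays.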

Performance characteristics for the various index recognition
circuits that are useful in performing the Golay transform skeletonizing
algorithm are summarized in Table 5.1. The trade off between circuit
complexity and operating speed is illustrated by the increase in
propagation delay as the number of tse components required to perform the
index recognition task is reduced. Table 5.1 also shows the potential
value of including a single line disable on the active tse logic devices
since only two of the five index recognition circuits are practical if
the single line disable is not provided.

An OR Latch Implementation of the Skeletonizing Algorithm 

A medium performance implementation of the Golay transform
skeletonizing algorithm using OR latches and the shift register type index
recognition circuit is illustrated in Figure 5.15. This circuit operates
in basically the same manner as the skeletonizing machine described in
Chapter 4. The OR latch design, however, has the advantage of
permitting changes in the subfield operating order. Normally the modular
operation specified by the Golay transform skeletonizing algorithm is
performed on basis points in subfields one, two, and three in that order.

If some other subfield operating order is employed, a skeleton of the 
original image will still be obtained but, depending upon the shape of 
the original image, the skeleton might be shifted to a slightly different 
location within the image field. This effect could be useful in a
potential cloud tracking application which has been suggested for the
skeletonizing machine [17].

The location and velocity of major cloud formations is one type of 
valuable meteorological information which can be obtained from earth 
resources satellites. Traditionally, successive images of the cloud for- 
mation are transmitted to ground stations where conventional computers 
are used to compute the speed and direction of the cloud formation. If 
a tse logic skeletonizing machine is included in the satellite, cloud 
images could potentially be skeletonized in real time. A sequence of 
skeletons could be transmitted to earth as one image which would show the 
track of the air mass carrying the cloud. Since less data would be 
transmitted to earth, the bandwidth of the transmission channel and the
processing load on earthbound computers could both be reduced. One 
potential problem is the time varying shape of the cloud formation. The 
skeletonizing operation should consistently produce skeletons at a point 
near the geometric center of the original cloud image to facilitate 
accurate cloud tracking. An OR latch skeletonizing machine design such 
as the one illustrated in Figure 5.15 would allow several skeletons of 
each cloud image to be obtained using different subfield operating orders. 
The average skeleton of the image would then provide a more consistent 
indication of cloud position than a single skeleton.
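
The effect of the subfield operating order on skeleton position, and the benefit of averaging, can be illustrated in software. The Python sketch below is an assumption-laden illustration rather than the machine's procedure. It uses a hexagonal array in axial coordinates with subfields given by (q - r) mod 3, treats indices one, two, and three as contiguous arcs of one, two, or three neighbors in state one, and interprets the "average skeleton" as the mean position of the single point skeletons obtained for each operating order.

```python
from itertools import permutations

# Neighbor offsets in cyclic order around a hexagonal cell (assumed layout).
NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def removable(cell, image):
    """True if the surround is a contiguous arc of 1, 2, or 3 ones."""
    q, r = cell
    bits = [(q + dq, r + dr) in image for dq, dr in NEIGHBORS]
    transitions = sum(bits[i] != bits[(i + 1) % 6] for i in range(6))
    return sum(bits) in (1, 2, 3) and transitions == 2


def skeletonize(image, order):
    """Repeat three-subfield iterations in the given order until stable."""
    image = set(image)
    while True:
        before = set(image)
        for s in order:
            image -= {c for c in image
                      if (c[0] - c[1]) % 3 == s and removable(c, image)}
        if image == before:
            return image


line = {(q, 0) for q in range(5)}        # stand-in for a cloud image
positions = []
for order in permutations(range(3)):     # all six subfield operating orders
    (point,) = skeletonize(line, order)  # each skeleton here is one point
    positions.append(point)

mean_q = sum(q for q, _ in positions) / len(positions)
mean_r = sum(r for _, r in positions) / len(positions)
print(sorted(set(positions)), (mean_q, mean_r))
```

For this test image the individual skeletons fall at three different positions depending on the operating order, while their mean coincides with the geometric center of the original line.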

The OR latch skeletonizing machine illustrated in Figure 5.15 is 
more amenable to alternating subfield operating orders than an improved 
version of the skeletonizing machine presented in Chapter 4 (Figure 4.17, 
page 78) would be because the new subfield produced by the current pro- 
cessing step is always combined with the other two subfields via the 
subfield multiplexing circuit. There are three alternate paths through
the subfield multiplexing circuit. Each path contains a film mask or an 
active tse OR gate with a programmed electroluminescent output array. 

The first image path is masked so that only those data points within sub- 
field one of the input tse will be transmitted. Data points within the 
second and third subfields are zeroed. Similarly, the second and third 
image paths transmit only data points that are within the second and 
third subfields of the image, respectively. Either the output of latch 
A or the Golay function circuit output can be selected for transmission 
through any of the image paths. Normally, the image output from latch A 
propagates through the two image paths which correspond to subfields that
are not currently being operated on, and the Golay function circuit out- 
put propagates through the remaining image path. The image paths are 
recombined at the output of the subfield multiplexing circuit to produce 
a new image which is the result of the current skeletonizing algorithm 
operation. If the current operation is the last of the three subfield 
operations, the new image can be compared to the result of the preceding 
operation to determine whether the skeleton is complete or another itera- 
tion is required. Image propagation through the three subfield multi- 
plexing circuit image paths is completely controlled by conventional 
logic signals. Thus, the control unit of the OR latch skeletonizing 
machine can be designed to select any required subfield operating order 
and change the order as often as necessary. 
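
The behavior of the subfield multiplexing circuit can be modeled with set operations. The Python sketch below is a functional illustration only. Images are represented as sets of points in state one, the subfield assignment (q - r) mod 3 on axial hexagonal coordinates is an assumption, and the function names are hypothetical.

```python
def subfield(q, r):
    """Assumed three subfield assignment on axial hexagonal coordinates."""
    return (q - r) % 3


def mask(image, s):
    """Model of one masked image path: pass only points in subfield s."""
    return {p for p in image if subfield(*p) == s}


def multiplex(latch_a, golay_out, active):
    """Select the Golay function output for the active subfield and the
    latch A output for the other two, then recombine the three paths."""
    out = set()
    for s in range(3):
        source = golay_out if s == active else latch_a
        out |= mask(source, s)   # paths are recombined at the output
    return out


latch_a = {(0, 0), (1, 0), (2, 0)}   # current image held in latch A
golay_out = {(2, 0)}                 # function output; valid in subfield 0 only
new_image = multiplex(latch_a, golay_out, active=0)
print(sorted(new_image))
```

Because the active subfield is simply a parameter, any subfield operating order can be produced by changing the sequence of values supplied by the control unit.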

The input logic required by the OR latch skeletonizing machine is 
significantly less complex than the input logic required by the machine 
presented in Chapter 4. This is partially due to the use of an inte- 
grated EXCLUSIVE-OR gate. In addition, however, note that the total 
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spiller has been replaced by a contractor device which produces a single 
CMOS compatible output. The output of the contractor is a logic one if 
any basis point changed states as a result of a modular operation per- 
formed during the current iteration of the skeletonizing algorithm. In 
that case, another iteration is required. The conversion from a total 
spiller to a contractor is a direct consequence of the assumption that 
active tse components can be deactivated by a one bit control line. As 
a further result of this assumption, some of the input logic operations 
can be performed by the conventional logic control unit, and fewer tse 
components are required to implement the input logic. 

Control of the OR Latch Skeletonizing Machine 

A timing diagram for performing one complete iteration of the 
skeletonizing algorithm using the OR latch skeletonizing machine (Figure 
5.15, page 107) is illustrated in Figure 5.16. The timing diagram includes 
control signals which are used to minimize power consumption by disabling
active tse devices when they are not in use. Each iteration of the 
skeletonizing algorithm is defined as one machine cycle, and each subfield 
operation is defined as a subcycle. Thus, there are three subcycles 
within the 315 unit gate delay machine cycle of the OR latch type skele- 
tonizing machine. 

The control unit for the skeletonizing machine can be developed 
using any of the conventional logic, sequential circuit design tech- 
niques. One state-of-the-art control unit design is illustrated in 
Figure 5.17. Control signals for the skeletonizing machine are gener- 
ated by programmable logic arrays (PLAs). A binary counter chain which 
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is driven at a clock rate corresponding to twice the inverse of the 
standard gate propagation delay provides inputs to the programmable logic 
arrays. Each unit time increment in the 315 unit gate delay machine cycle
is uniquely defined by the state of the counter chain. The PLAs produce
the control signals required by the skeletonizing machine by decoding the
state of the counter chain, the state of the contractor output, and the
state of external control signals which start and stop the machine. An
output signal from one PLA resets the binary counter chain at the end of 
each machine cycle. In applications such as cloud tracking where a 
fixed sequence of different subfield operating orders is required, an 
additional binary counter circuit can be used to determine the current 
subfield operating order. The programmable logic array would provide a 
clock pulse to the counter at the completion of each machine cycle. The 
PLA would then decode the new counter state to determine the subfield 
operating order required during the next machine cycle. Table 5.2 sum- 
marizes the function of each control signal produced by the OR latch 
skeletonizing machine control unit. 
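
The counter based organization described above can be sketched as a small state machine. The Python model below is a simplified illustration under several assumptions: the counter advances once per unit time increment, the three subcycles are of equal length, and the additional counter steps through all six subfield operating orders. None of these details, nor the class and method names, are taken from the source.

```python
from itertools import permutations

CYCLE_LENGTH = 315                        # unit gate delays per machine cycle
ORDERS = list(permutations((1, 2, 3)))    # candidate subfield operating orders


class ControlUnit:
    def __init__(self):
        self.time_count = 0    # state of the binary counter chain
        self.order_count = 0   # additional counter selecting the order

    def tick(self):
        """Advance one unit time increment; return True at end of cycle."""
        if self.time_count == CYCLE_LENGTH - 1:
            self.time_count = 0   # reset signal at the end of the cycle
            self.order_count = (self.order_count + 1) % len(ORDERS)
            return True
        self.time_count += 1
        return False

    def subcycle(self):
        """Subcycle number (1 to 3), assuming equal length subcycles."""
        return self.time_count // (CYCLE_LENGTH // 3) + 1

    def subfield_in_process(self):
        """Decode both counters into the subfield currently operated on."""
        return ORDERS[self.order_count][self.subcycle() - 1]


unit = ControlUnit()
first_subfield = unit.subfield_in_process()
cycle_ends = [unit.tick() for _ in range(CYCLE_LENGTH)]
print(first_subfield, unit.time_count, unit.order_count,
      ORDERS[unit.order_count])
```

After one complete machine cycle the time counter returns to zero and the order counter selects the next operating order, which is the behavior the text attributes to the PLA reset and clock pulse outputs.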

An Improved AND Latch Implementation of the 
Skeletonizing Algorithm 

The OR latch type skeletonizing machine can be converted to an 
improved version of the AND latch type skeletonizing machine presented 
in Chapter 4 by redesigning register A, register A', and the subfield
multiplexing circuit. A schematic for the AND latch type skeletonizing 
machine is presented in Figure 5.18, and a timing diagram for one machine 
cycle is illustrated in Figure 5.19. Since a higher data rate is 
achieved with fewer tse components, this design is preferred over the 
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Figure 5.18 Schematic diagram for an improved AND latch hardwired skeletonizing machine. 
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Figure 5.19 Timing diagram for one complete iteration of the skeletonizing algorithm using the 
improved AND latch skeletonizing machine. 
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OR latch design when only one subfield operating order is required. A 
control unit for the AND latch machine could be constructed using the 
technique described for the OR latch machine. Control signals for the 
AND latch skeletonizing machine are described in Table 5.3. The
hardware costs, power requirements, and performance of the AND latch and
OR latch versions of the skeletonizing machine are summarized in Table 5.4.

The AND and OR latch skeletonizing machines presented in this
chapter are medium performance machines. In particular applications, a
higher throughput design or a design which requires fewer tse components 
may be required. Because the basic skeletonizing machine designs are 
independent of the type of index recognition circuit used, these machines 
can be tailored to specific applications by selecting the appropriate 
index recognition circuit from Table 5.1, page 106. In low data rate 
applications, the improved comparison type index recognition circuit 
should be used to reduce hardware costs. For applications where the 
machine cycle time of the skeletonizing machine using a shift register 
based index recognition circuit is too long, the multiplexed or space 
iterative index recognition circuits can be employed at a significantly 
higher hardware cost. Alternately, the skeletonizing machine organization
described in Chapter 7 can be utilized. This organization offers several 
performance advantages in ultrahigh data rate applications. 
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TABLE 5.3 

IMPROVED AND LATCH SKELETONIZING MACHINE CONTROL SIGNALS 
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PERFORMANCE CHARACTERISTICS OF THE IMPROVED 
AND LATCH AND OR LATCH SKELETONIZING MACHINES 
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A PIPELINED ARCHITECTURE FOR THE SKELETONIZING MACHINE 

The medium performance tse logic skeletonizing machines presented 
in Chapter 5 achieve high data processing rates using a minimum number 
of elementary tse logic devices. However, certain applications may
require even higher data processing rates at the expense of additional
hardware. The various index recognition circuit designs described in
Chapter 5 illustrate that the data processing rate can be improved
significantly by increasing the hardware complexity of the index
recognition circuit. This chapter presents a pipelined architecture for
the skeletonizing machine which allows further improvements in the 
data processing rate through a more efficient utilization of the index 
recognition circuit. 


Multiple Tse Processing 

The Golay transform algorithms permit the transformation of only 
one subfield of an image at a time. As a result, only one third or 
less (less for the four or seven subfields case) of the elements of 
each tse logic device used in the Golay neighbor planes generator, index 
recognition, and Golay function circuits can be performing a useful 
operation on a single image at any instant. The operation performed by 
the Golay neighbor planes generator inherently restricts the number of 
simultaneous input images to one because all subfields of the input 
image are involved in the output function. Points within different 
subfields do not interact in the index recognition and Golay function
circuits. Therefore, these circuits are capable of processing distinct
subfields from several different images simultaneously. 

Figure 6.1 illustrates a three plane mixer circuit which can be 
used to create a composite image for parallel processing by the index 
recognition circuit. The inputs to the three plane mixer are Golay 
neighbor planes of the same type from three separate tses. To achieve 
the minimum index recognition time, the Golay neighbor planes must be 
available simultaneously, and, thus, three Golay neighbor planes gen- 
erator circuits are required. The shift register based index recog- 
nition circuit (Figure 5.11, p. 101) provides input latches which can 
accept the three subfields of the composite tse from the three plane 
mixer sequentially. This permits the use of a single Golay neighbor 
planes generator circuit when a somewhat longer image processing time 
is acceptable. Six three plane mixer circuits are required to produce 
the composite Golay neighbor planes GA*, GB*, GC*, GD*, GE*, and GF*.
One additional three plane mixer is used to create a composite image,
QA*, of the three tses currently being processed. This image is one 
input to the Golay function logic. 
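
The masking and combining performed by a three plane mixer can be sketched in software. The following is only an illustration under stated assumptions: images are modeled as lists of rows of 0/1 values, and the function `subfield_of`, which labels each point with one of three subfields, is a hypothetical labelling chosen for the sketch rather than the subfield geometry used by the tse hardware.

```python
# Sketch of a three plane mixer. Images are lists of rows of 0/1 values.

def subfield_of(row, col):
    """Hypothetical subfield labelling (0, 1 or 2) used only for this sketch."""
    return (row - col) % 3


def subfield_mask(rows, cols, k):
    """Mask image with 1s at every point assigned to subfield k."""
    return [[1 if subfield_of(r, c) == k else 0 for c in range(cols)]
            for r in range(rows)]


def three_plane_mixer(plane_1, plane_2, plane_3):
    """Composite = (plane_1 AND mask_1) OR (plane_2 AND mask_2) OR (plane_3 AND mask_3)."""
    planes = (plane_1, plane_2, plane_3)
    rows, cols = len(plane_1), len(plane_1[0])
    masks = [subfield_mask(rows, cols, k) for k in range(3)]
    return [[int(any(planes[k][r][c] & masks[k][r][c] for k in range(3)))
             for c in range(cols)]
            for r in range(rows)]


ones = [[1] * 3 for _ in range(3)]
zeros = [[0] * 3 for _ in range(3)]
# Subfields one and three come from all-ones planes, subfield two from an all-zero plane.
composite = three_plane_mixer(ones, zeros, ones)
```

Each point of the composite image is taken from exactly one input plane, which is why points of different subfields never interact in the mixer output.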

A Minimum Hardware, Modified Pipeline Implementation of 
the Skeletonizing Algorithm 

The possibility of processing distinct subfields from several
different images simultaneously suggests the development of a skeletonizing
machine architecture which uses a modification of the pipeline
principle. Figure 6.2 illustrates the traditional pipeline organization
for an image processing machine. The image processing algorithm is
broken down into a sequence of steps which can be implemented by
successive logic circuits. Synchronizing registers [18] store the partial
result obtained from each step while the succeeding processing step is
being performed. Images are fed into the top of the pipeline,
processed as they are clocked down the pipeline, and extracted at the bottom.
This type of machine processes images at the rate at which they can be
clocked into the machine rather than at a rate which is dependent upon
the complexity of the algorithm. Although this organization is
expensive in terms of hardware cost, the pipeline is highly efficient since
all of the gating can be utilized 100 percent of the time [18].

Golay transform algorithms are generally not fixed length
algorithms because the required number of iterations is a function of the
image being processed. An excessively large number of synchronizing
registers and logic circuits would be required to guarantee the
completion of the Golay transform in a straight pipeline of the type
illustrated in Figure 6.2. This difficulty can be overcome by providing
a gated feedback path for the output image. A logic circuit at the
input of the pipeline would determine whether to gate a new image into
the first synchronizing register or gate the current output back into
the pipeline for additional processing. Using this technique, any
integral number of processing stages can be used to construct a machine
to perform a convergent Golay transform such as the skeletonizing
algorithm.
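
The gated feedback idea can be illustrated with a short sketch. It assumes that an image can be represented by an integer and that convergence is detected when a trip through the pipeline leaves the image unchanged. The function `toy_stage` is a hypothetical stand-in for one processing stage and is not a Golay transform.

```python
def run_with_feedback(image, stages, max_trips=1000):
    """Recirculate an image through a fixed pipeline until a trip changes nothing.

    Returns the converged image and the number of trips through the pipeline.
    """
    current = image
    for trip in range(1, max_trips + 1):
        result = current
        for stage in stages:          # one pass down the pipeline
            result = stage(result)
        if result == current:         # input logic: no change, gate the image out
            return result, trip
        current = result              # input logic: gate the output back in
    raise RuntimeError("transform did not converge")


def toy_stage(image):
    """Hypothetical stage: clears the lowest set bit while more than one bit remains."""
    reduced = image & (image - 1)
    return reduced if reduced else image


skeleton, trips = run_with_feedback(0b1011, [toy_stage, toy_stage])
```

The number of stages in the list is arbitrary, which mirrors the observation that any integral number of processing stages can be used once the feedback path is available.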

Although feasible, a conventional pipeline organization of a Golay 
transform machine would not be efficient because only one subfield of 
an input image could be processed at a time. Thus, one-third or more
of the elements of each device employed in the image processing logic
would be unused at any instant. Figure 6.3 is a block diagram of a
unique pipeline organization for Golay transform processing machines.
This organization allows one image processing logic circuit to process
one subfield from each of three different pipelined images
simultaneously. The hardware minimization achieved by using this architecture
is extremely important because of the high cost projected for tse logic
devices.

A block diagram of a tse logic implementation of the skeletonizing 
algorithm using the new pipeline organization is illustrated in Figure 
6.4. The Golay neighbor planes generator, three plane mixer, index 
recognition, and Golay function logic circuits comprise the image pro- 
cessing logic. Three Golay neighbor planes generators are required to 
obtain the highest processing rates. However, Figure 6.4 illustrates 
a lower hardware cost design which takes advantage of the latches 
available in the shift register based index recognition circuit to per- 
mit sequential use of a single Golay neighbor planes generator circuit. 

Latches A, B, and C are the synchronizing registers of a conven- 
tional pipeline organization. One subfield operation is performed on 
the input image as the image is clocked between each succeeding latch. 
Two plane mixer circuits, such as the one illustrated in Figure 6.5,
are used to form the composite images which represent the result of
each subfield operation. In addition to the two plane mixer which
provides the feedback signal to the layer input logic, a two plane mixer
is required at the input to latch B and latch C. Latches B' and C'
preserve the images present at the beginning of each iteration of the
algorithm for comparison to the processed images.

Figure 6.3 Block diagram of a modified pipeline architecture for tse
logic image processing machines using Golay transforms.

The operation of this machine can be explained by following an
image, J, through one subfield operation of the skeletonizing algorithm.
Image J is gated through the layer input logic and clocked into latch
A. Simultaneously, other images are clocked into latches B, B', C,
and C'. Now, assume that latch B' contains a previous input image, K,
latch B contains the result of the first subfield operation on K, K*,
latch C' contains a previous input image, L, and latch C contains the
result of the first and second subfield operations on image L, L**.
The output of latch A is enabled and Golay neighbor planes are
generated for image J. Subfield one of each of the Golay neighbor planes
from image J is gated through the three plane mixer network and clocked
into the shift register type index recognition circuit. The output of
latch A is disabled, and the output of latch B is enabled so that
subfield two from each Golay neighbor plane of image K* can be clocked
into the index recognition circuit. Subfield three from each Golay
neighbor plane of image L** is clocked into the index recognition
circuit after disabling output QB of latch B and enabling output QC of
latch C.

Upon completion of the index recognition procedure, images J, K*,
and L** are combined in another three plane mixer, and a composite
image consisting of subfield one of image J, subfield two of image K*,
and subfield three of image L** is provided to the Golay function logic.
Image F, which is formed by the Golay function logic, consists of the
first, second, and third subfields of images J, K*, and L**,
respectively. The first subfield of image F and the second and third
subfields of image J are combined by a two plane mixer for clocking into
latch B. Subfield two of image F is combined with subfields one and
three of image K* by another two plane mixer for clocking into latch C.
Image L** has just completed one iteration of the skeletonizing
algorithm, so subfield three of image F is combined with the previously
processed subfields one and two of image L** to form image F** which is the
result of the current iteration performed on image L. Image F** is
EXCLUSIVE-ORed with image L to detect any differences in the images.
If no differences are detected, image F** is the skeleton of image L
and will be gated out of the machine as a new image is clocked into
latch A. If a difference is detected, image F** is gated into the
input of latch A for additional processing. Latches A, B, C, B', and C'
are then clocked to store new images in them, and processing continues
as described above. At this point latch B' contains image J, latch
B contains image J* which is the result of the first subfield
operation on image J, latch C' contains image K, latch C contains image K**
which is the result of the second subfield operation on image K, and
latch A contains either a new image or the result of the last subfield
operation on image L. Successive subfield operations continue on each
of the images until no reduction of the image is obtained in one
complete iteration, and the skeleton of the image is gated out of the
machine. When three Golay neighbor planes generators are utilized to
obtain higher data processing rates, the serial procedure for
generating composite Golay neighbor planes is not required, and any type of
index recognition circuit can be employed.
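
The flow of images through latches A, B, and C can be followed with a small scheduling sketch. It models only the occupancy of the three synchronizing latches. The image processing itself is not simulated, and the number of iterations each image needs before the EXCLUSIVE-OR comparison finds no change is supplied as an assumed input. The function name and data layout are hypothetical.

```python
from collections import deque


def simulate_modified_pipeline(iterations_needed):
    """Track which image leaves the machine on which clock.

    iterations_needed maps an image name to the number of complete iterations
    (assumed known here, at least 1) after which no further change is detected.
    Returns a list of (clock, image name) pairs in output order.
    """
    waiting = deque(iterations_needed.items())
    latch = {"A": None, "B": None, "C": None}
    finished = []
    clock = 0
    while waiting or any(latch.values()):
        clock += 1
        leaving = latch["C"]                      # third subfield operation is complete
        latch["C"], latch["B"] = latch["B"], latch["A"]
        latch["A"] = None
        if leaving is not None:
            name, remaining = leaving
            if remaining > 1:
                latch["A"] = (name, remaining - 1)   # difference detected: feed back
            else:
                finished.append((clock, name))       # no difference: gate image out
        if latch["A"] is None and waiting:
            latch["A"] = waiting.popleft()           # layer input logic admits a new image
    return finished


schedule = simulate_modified_pipeline({"J": 1, "K": 1, "L": 1})
```

In this sketch an image needs three clocks to complete one iteration, but once the latches are full one image completes an iteration on every clock.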

Figure 6.6 Schematic for the minimum hardware, modified pipeline implementation of the
skeletonizing algorithm.

Figure 6.6 is a schematic of the minimum hardware, modified
pipeline implementation of the skeletonizing algorithm. The timing
diagram provided in Figure 6.7 shows that the cycle time of this machine
is 144 gate delays. For a particular image, each iteration of the
skeletonizing algorithm requires 432 gate delays. However, because of
the pipeline organization and simultaneous processing of three
different images, the effective or average time for each iteration is
only 144 gate delays. As a result, this machine can process 83 simple
images per minute compared to only 38 simple images per minute for the
machines described in Chapter 5 which use the same index recognition
circuit.
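
The quoted data rate can be reproduced with a short calculation. The gate delay of 5 milliseconds used below is an inference from Table 6.4, which lists 144 gate delays as 0.72 seconds. Treating a simple image as one that needs a single effective iteration is likewise an assumption made to match the tabulated rate.

```python
# Assumption: gate delay inferred from Table 6.4 (144 gate delays = 0.72 s).
GATE_DELAY_SECONDS = 0.72 / 144


def simple_images_per_minute(cycle_gate_delays, gate_delay=GATE_DELAY_SECONDS):
    """Data rate when one image completes an iteration every machine cycle."""
    return 60.0 / (cycle_gate_delays * gate_delay)


def iteration_latency_gate_delays(cycle_gate_delays, subfields=3):
    """Gate delays a single image spends in one complete iteration."""
    return cycle_gate_delays * subfields


rate = round(simple_images_per_minute(144), 2)
latency = iteration_latency_gate_delays(144)
```

The latency seen by one image remains three machine cycles. Only the average time per iteration drops to one cycle, because three images share the image processing logic.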

Thirty-four control signals are required by this machine. Their
functions are outlined in Table 6.1. The hardware cost function for
this implementation of the skeletonizing algorithm is 183A + 129B for
a total of 312 components. A measure of the efficiency of this design
is provided by the speed-power product of 120 watt-seconds compared to
over 229 watt-seconds for the hardwired machines described in Chapter 5.

TABLE 6.1

MINIMUM HARDWARE, MODIFIED PIPELINE SKELETONIZING
MACHINE CONTROL SIGNALS

Control   Number of Components   Control
Signal    Controlled             Function

MP         5                     Machine Power
IP         3                     Input Logic Power
AE1        3                     Layer Input Gate
AC1        2                     Latch A Slave
AE2        2                     Latch A Slave
AC2        2                     Latch A Master
AQ         2                     Latch A Output
C'C        1                     Latch C Output
GP        23                     GNPG Power
S1         6                     Mixer Subfield 1
S2         6                     Mixer Subfield 2
S2*       12                     Mixer Subfield 2 NOT
S3         6                     Mixer Subfield 3
SP        18                     Shift Register Power
PL         6                     Shift Register Input
SR         6                     Shift Slave into Master
E1         6                     Shift Right to Next Latch
C1         6                     Store Slave
E2         6                     Input to Master
C2         6                     Store Master
IRP       16                     Index Recognition Combinational Logic Power
EID        1                     Index Recognition Latch Input
IDP        2                     Index Recognition Latch Power
FP        17                     Golay Function and Multiplexer Power
BE1        4                     Latch B and C Slave Input
BC1        2                     Latch B and C Slave Feedback Path
BE2        2                     Latch B and C Master Input
BC2        2                     Latch B and C Master Feedback Path
BQ         1                     Latch B Output
CQ         2                     Latch C Output
SMT        1                     Subfield Multiplexer Output
TS         4                     Contractor Power
CP         1                     Continue Processing Control
LO         1                     Layer Output Control
OD         0                     Contractor Output Signal

Additional Modified Pipeline Implementations of
the Skeletonizing Algorithm

Figure 6.8 illustrates the general schematic for modified
pipeline implementations of the skeletonizing algorithm when three Golay
neighbor planes generators are available. The index recognition
circuit can be of any desired type. Figure 6.9 is the timing diagram for
the case of the shift register based index recognition circuit which
was utilized in the minimum hardware, modified pipeline skeletonizing
machine. Machine cycle time is reduced to 108 gate delays for a
processing rate of 111 simple images per minute. The hardware cost
function is 193A + 157B for a total of 350 components. Elimination of the
serial procedure for generating the composite Golay neighbor planes
reduces the speed-power product by 14.5 percent to 102 watt-seconds.
This implementation of the skeletonizing machine provides approximately
twice the data processing capability of the OR latch machine presented
in Chapter 5 while reducing the speed-power product by 56 percent. The
cost of this improved performance is a 57 percent increase in the
number of active components and a 76 percent increase in the number of
passive components required to build the skeletonizing machine. The
functions of the 29 control signals required by this machine are
outlined in Table 6.2.

Figure 6.8 Schematic for the modified pipeline implementation of the skeletonizing algorithm
with three Golay neighbor planes generators.

Figure 6.9 Timing diagram for the modified pipeline skeletonizing
machine with the shift register based index recognition circuit.

For applications which require ultrahigh data processing rates,
a space iterative index recognition circuit can be used in the
skeletonizing machine in Figure 6.8. A timing diagram for this
implementation of the skeletonizing algorithm is illustrated in Figure 6.10.
Control signal functions are described in Table 6.3. The hardware cost
of this machine is 249A + 189B. A total of 438 components, or more
than twice the number of components required by the OR latch
skeletonizing machine, are utilized in this ultrahigh speed design.
Figure 6.10 shows that the machine cycle time is only 34 gate delays
which allows the skeletonizing machine to process 353 simple images
per minute. This is approximately nine times the processing speed of
the OR latch machine described in Chapter 5. The speed-power product
of this high speed, modified pipeline skeletonizing machine is 93
watt-seconds versus 229 watt-seconds for the OR latch machine. The
characteristics of the three modified pipeline skeletonizing machines
described in this chapter are summarized in Table 6.4.
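
The speed-power figures quoted for the three machines can be cross-checked numerically. The sketch below assumes that the speed-power product is the average power multiplied by the time per iteration, and that one gate delay is 5 milliseconds. Both are inferences from the values listed in Table 6.4 rather than definitions stated in the text.

```python
GATE_DELAY_SECONDS = 0.005  # assumption: 0.72 s / 144 gate delays, from Table 6.4

# (gate delays per iteration, average power in watts), as listed in Table 6.4
TABLE_6_4 = {
    "minimum hardware": (144, 166.19),
    "three generators, shift register": (108, 189.50),
    "three generators, space iterative": (34, 545.12),
}


def speed_power_product(gate_delays, average_watts):
    """Average power times iteration time, in watt-seconds (assumed definition)."""
    return average_watts * gate_delays * GATE_DELAY_SECONDS


products = {name: round(speed_power_product(*row), 2)
            for name, row in TABLE_6_4.items()}

# Reduction obtained by eliminating the serial neighbor plane procedure.
reduction_percent = round(
    100 * (1 - products["three generators, shift register"]
           / products["minimum hardware"]), 1)
```

Under these assumptions the computed products agree with the tabulated 119.66, 102.33, and 92.67 watt-seconds, and the reduction agrees with the 14.5 percent quoted for the three generator design.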

TABLE 6.2

CONTROL SIGNALS FOR THE MODIFIED PIPELINE SKELETONIZING
MACHINE WITH THREE GOLAY NEIGHBOR PLANES GENERATORS

Control   Number of Components   Control
Signal    Controlled             Function

MP         5                     Machine Power
IP         3                     Input Logic Power
AE1        3                     Layer Input Gate
AC1        2                     Latch A Slave
AE2        2                     Latch A Slave
AC2        2                     Latch A Master
AQ         2                     Latch A Output
C'C        1                     Latch C Output
GP        63                     GNPG Power
SP        18                     Shift Register Power
PL         6                     Shift Register Input
SR         6                     Shift Slave into Master
E1         6                     Shift Right to Next Latch
C1         6                     Store Slave
E2         6                     Input to Master
C2         6                     Store Master
IRP       16                     Index Recognition Combinational Logic Power
EID        1                     Index Recognition Latch Input
IDP        2                     Index Recognition Latch Power
FP        17                     Golay Function and Multiplexer Power
BE1        4                     Latch B and C Slave Input
BC1        2                     Latch B and C Slave Feedback Path
BE2        2                     Latch B and C Master Input
BC2        2                     Latch B and C Master Feedback Path
BQ         3                     Latch B Output
SMT        1                     Subfield Multiplexer Output
TS         4                     Contractor Power
CP         1                     Continue Processing Control
LO         1                     Layer Output Control
OD         0                     Contractor Output Signal

Figure 6.10 Timing diagram for the modified pipeline skeletonizing
machine with the space iterative index recognition circuit.

TABLE 6.3

CONTROL SIGNALS FOR THE MODIFIED PIPELINE
SKELETONIZING MACHINE WITH THE SPACE ITERATIVE INDEX
RECOGNITION CIRCUIT

Control   Number of Components   Control
Signal    Controlled             Function

MP         5                     Machine Power
IP         3                     Input Logic Power
AE1        3                     Layer Input Gate
AC1        2                     Latch A Slave
AE2        2                     Latch A Slave
AQ         4                     Latch A Output
C'C        1                     Latch C Output
GP        63                     GNPG Power
IR       131                     Index Recognition Circuit Power
FP        15                     Golay Function and Multiplexer Power
BE1        4                     Latch B and C Slave Input
BC1        2                     Latch B and C Slave Feedback Path
BE2        2                     Latch B and C Master Input
BC2        2                     Latch B and C Master Feedback Path
BQ         3                     Latch B Output
SMT        1                     Subfield Multiplexer Output
TS         4                     Contractor Power
CP         1                     Continue Processing Control
LO         1                     Layer Output Control
OD         0                     Contractor Output Signal

TABLE 6.4

PERFORMANCE CHARACTERISTICS OF THE MODIFIED PIPELINE SKELETONIZING MACHINES

                                     Number of   Total       Gate Delays      Data Rate     Average       Peak          Speed-Power
                                     Control     Component   per Iteration    (Simple       Power         Power         Product
                        Cost         Signals     Count       (Time in         Images per    Consumption   Consumption   (Watt-
Machine Type            Function                             Seconds)         Minute)       (Watts)       (Watts)       Seconds)

Minimum hardware,       183A+129B    34          312         144 (0.72)        83.33        166.19        255           119.66
modified pipeline

Modified pipeline       193A+157B    29          350         108 (0.54)       111.11        189.50        363           102.33
with shift register
index recognition

Modified pipeline       249A+189B    19          438          34 (0.17)       352.94        545.12        714            92.67
with space iterative

Extension of the modified pipeline organization described in this
chapter to the realization of Golay transforms involving four or seven
subfields is straightforward. Each additional subfield requires two
additional latches, one for storage of the input image and one for
temporary storage of the additional partial result from the subfield
operation for that subfield. The three plane mixer circuits become four
or seven plane mixers which consist of a mask for each subfield and OR
gates to combine the multiple masked inputs into a single composite
output image. Four or seven Golay neighbor planes generators are
required unless the sequential technique for generating the composite
Golay neighbor planes is employed. The two plane mixer circuits
require only a mask change to ensure that the correct subfields from the
two input images are combined to form the composite output image. Since
the Golay neighbor planes generator for the four subfields case is
much simpler than for the three or seven subfields case, the case of
four Golay neighbor planes generators does not increase the basic
hardware cost of the Golay transform machine. However, the seven Golay
neighbor planes generators for the seven subfields case represent a
substantial increase in hardware cost for the basic Golay transform machine
organization.
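
The scaling rules stated above can be collected in a small helper. This is only a bookkeeping sketch: the base count of five latches (A, B, C, B', and C') is taken from the three subfield machine of this chapter, and the function name and dictionary keys are hypothetical.

```python
def pipeline_resources(n_subfields, sequential_generator=False):
    """Latch, mixer, and generator counts for an n subfield modified pipeline."""
    if n_subfields < 3:
        raise ValueError("the sketch starts from the three subfield machine")
    return {
        # five latches for three subfields, two more per additional subfield
        "latches": 5 + 2 * (n_subfields - 3),
        # each composite mixer needs one masked input per subfield
        "mixer_inputs": n_subfields,
        # one generator suffices when neighbor planes are generated sequentially
        "neighbor_plane_generators": 1 if sequential_generator else n_subfields,
    }


four_subfields = pipeline_resources(4)
seven_subfields = pipeline_resources(7, sequential_generator=True)
```

The counts say nothing about the relative cost of each generator, which the text notes is much lower in the four subfield case.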

This chapter presented the development of modified pipeline re- 
alizations of the skeletonizing algorithm which are useful in appli- 
cations that require high data processing rates. Chapter 7 will de- 
scribe the design of a programmable tse computer architecture which 
is capable of performing Golay transforms with a relatively small num- 
ber of tse logic devices. 


CHAPTER 7

A SPECIAL PURPOSE PROGRAMMABLE TSE PROCESSOR

A number of different architectures for hardwired tse logic
skeletonizing machines have been presented in previous chapters.
These machines have the advantage of providing relatively high data
processing rates but do not offer the flexibility which could be
obtained with a programmable tse processor. This chapter presents a
special purpose programmable tse computer which can be used to perform
numerous Golay transforms. A microcomputer-based tse computer control
unit is described and used to define a basic instruction set for the 
tse processor. Programs for performing the Golay transform skeleton- 
izing and swelling algorithms are illustrated. In addition, the use 
of advanced microprogramming techniques to define additional instruc- 
tions for the tse computer is demonstrated. 

A Programmable Tse Computer 

Figure 7.1 is a block diagram of a special purpose tse computer 
which is designed to perform Golay transform operations on images 
divided into three subfields. The machine consists of an arithmetic- 
logic unit (ALU), an accumulator latch (A), two general purpose
latches (B and C), an index recognition latch (I), an output latch
(O), a contractor, an index recognition circuit, and a control unit.
The ALU includes a latch which temporarily stores the result of an
ALU operation when the accumulator or a general purpose latch is
being used as both a source and a destination register for the current
operation. This prevents undesirable race conditions from developing
in the tse processor. 

The accumulator and general purpose registers, B and C, serve 
as both source and destination registers. Latch O can only function
as a destination register, and latch I can only function as a source
register for ALU operations. In the ALU operations which require
two operands, the accumulator is entered in the right side of the ALU
as one operand, while the output of latch B, C, or I is entered in the
left side of the ALU as the second operand. The ALU is capable of 
performing the AND, EXCLUSIVE-OR, and COMPLEMENT operations. Images 
can also be gated through the ALU to the input of any destination 
register. Since the registers are OR latches, the OR operation is 
normally performed by gating one operand through the ALU and ORing 
that image with the second operand which is stored in the destination 
register. 
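
The way an OR latch yields the OR operation without a dedicated ALU function can be sketched as follows. The class is a hypothetical software model in which a tse is an integer bit pattern. That the latch is cleared before a plain load is an assumption of the sketch.

```python
class OrLatch:
    """Toy model of an OR latch: every image gated in is ORed with the stored one."""

    def __init__(self):
        self.value = 0

    def clear(self):
        self.value = 0

    def gate_in(self, image):
        # The stored image can only gain 1s until the latch is cleared.
        self.value |= image

    def load(self, image):
        # Assumption: a plain store is a clear followed by gating the image in.
        self.clear()
        self.gate_in(image)


destination = OrLatch()
destination.load(0b1100)        # second operand already held in the destination
destination.gate_in(0b0011)     # first operand gated through the ALU unchanged
```

After the two steps the destination holds the OR of both operands, which is the behavior described for the register organization above.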

Two independent mask generator circuits of the type illustrated 
in Figure 7.2 are included in the ALU. An image mask for any of the 
three possible subfields or any combination of those subfields can be 
created by controlling the states of the three tse read-only-memories. 
For example, if M1, M2, and M are all active, the output image will
be M3. The ALU is organized so that these mask tses can be ORed or 
ANDed with the ALU input images from the source registers. This pro- 
vides the capability of performing ALU operations on entire images or 
only on selected subfields of the images. The result of each ALU 
operation is tested by the contractor which detects all zero tses. 
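
A software analogue of the mask generator is sketched below. Each of the three read-only memories is modeled as a fixed subfield pattern, and the output mask is the OR of the patterns whose control line is active. The subfield labelling `(row - col) % 3` is a hypothetical choice made only for the illustration.

```python
def mask_generator(rows, cols, m1, m2, m3):
    """OR together the stored subfield patterns whose control line is active.

    m1, m2, m3 are booleans standing for the states of the three read-only
    memories. The labelling (row - col) % 3 is an assumption of this sketch.
    """
    enabled = (m1, m2, m3)
    return [[1 if enabled[(r - c) % 3] else 0 for c in range(cols)]
            for r in range(rows)]


subfield_one_only = mask_generator(2, 3, True, False, False)
subfields_two_and_three = mask_generator(2, 3, False, True, True)
whole_image = mask_generator(2, 3, True, True, True)
```

ANDing an image with such a mask restricts an operation to the selected subfields, while a mask of all ones lets the entire image through.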

Figure 7.2 A three subfield mask generator circuit.

In addition to the ALU, the tse computer utilizes an index
recognition network consisting of a Golay neighbor planes generator
and a comparison type index recognition circuit. This function
could be implemented as a programmed ALU operation using individual
logical and slide operations but would then require extremely long
execution times. The hardwired comparison type index recognition
circuit provides an effective trade off between the hardware cost
and the speed of the Golay transform tse computer while preserving
the ability to recognize all fourteen possible indices. The
accumulator is the source register for all index recognition operations.
Latch I is the destination register.
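
The count of fourteen indices is consistent with the number of distinct six neighbor surrounds when patterns that differ only by a rotation are treated as equivalent. The sketch below checks that count. It assumes a surround is encoded as a 6-bit pattern, and the class representatives it produces are arbitrary labels, not the Golay index numbering.

```python
def rotate(pattern, k):
    """Rotate a 6-bit neighbor pattern by k positions around the hexagon."""
    k %= 6
    return ((pattern << k) | (pattern >> (6 - k))) & 0b111111


def rotation_class(pattern):
    """Smallest rotation of the pattern, used as the class representative."""
    return min(rotate(pattern, k) for k in range(6))


# One representative for every rotation-equivalent surround.
CLASSES = sorted({rotation_class(p) for p in range(64)})
number_of_classes = len(CLASSES)
```

A recognizer built this way would compare the rotation class of the surround of each basis point against the class selected for the transform.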

A schematic of the tse logic used in the Golay transform computer 
is illustrated in Figure 7.3. The hardware cost function for this 
machine is 97A+69B. No random access tse memory is included in the 
computer organization because of the high hardware cost of tse memory. 
Also, due to the flexibility of the tse logic ALU, external memory 
requirements should be minimal. In most applications, only a serial 
input image buffer memory would be required to synchronize the 
variable length Golay image processing operation to the incoming 
data. Additional general purpose registers could be added to the tse 
computer organization if they are needed for temporary storage 
of partial results from a complex transform operation. This would 
require less hardware than adding external memory to the tse computer. 

Tse Computer Control Unit 

Theoretically, a computer control unit should be able to
produce the optimum control bit sequence for performing any operation
which is feasible with the register and ALU organization of the
controlled computer. In practice, this degree of flexibility can only be
approached by utilizing a microprogrammed control unit which contains
the control bit combinations in microinstructions that are read from a
control memory. Groups of microinstructions form microprograms that
control the execution of each macroinstruction. Thus, microprogrammed
computers require a small computing section within the central
processing unit (CPU) to execute the microprograms. A block diagram of the
organization of a typical microprogrammed control unit is shown in
Figure 7.4. The cycle time of a microprogrammed control unit must be
significantly faster than the minimum cycle time of the main computer for
the microprogrammed control unit to be efficient. In conventional
computers this requirement places severe restrictions on the design of the
control unit. The trade off between the complexity of the control unit
and the time required to decode and execute each instruction must be
carefully considered in the design. Since the projected propagation
delay of tse logic devices is several orders of magnitude longer than
the propagation delay of standard bipolar and MOS logic, a complex
conventional logic control unit can potentially provide efficient control
of a tse logic ALU.

A tse computer control unit that provides the benefits of
microprogramming and permits the use of conventional semiconductor memory for
tse computer program storage has been designed. The control unit is
based on RCA's CDP 1802 COSMAC microprocessor [19]. Figure 7.5 is a
detailed block diagram of the control unit. The 1802 microprocessor was
chosen over other currently available devices because of several unique
architectural features which enhance the input-output (I/O) and control
capabilities of the device. These features include on-chip I/O and the
use of multiple program counters. A brief summary of the architecture
and instruction set of the 1802 is provided in Appendix C.

Figure 7.5 Block diagram of the tse computer control unit.

The 1802 microprocessor controls the tse components by outputting 
control bytes and executing internal timing loops to account for the 
propagation delays of the tse logic devices. Flag input EFl monitors 
the output of the contractor to detect all zero tses at the output of 
the ALU. The EF2 flag is used for external input requests. These re- 
quests are acknowledged by setting bit three of output port five. In- 
put images are gated into latch A under control of the Q line. At the 
end of the execution cycle for major tse instructions^ EF3 is tested 
for an external request to halt tse processing. This feature can be 
used to single step the tse computer through the more complex tse in- 
structions. Conventional techniques [19] can be used to single step the 
1802 microprocessor through the simpler tse instructions and the micro- 
programs themselves. Table 7.1 summarizes the function of each major 
tse control signal. 


Tse Computer Instruction Set 

An instruction set consisting of all the standard CDP 1802 in- 
structions plus 26 generic tse instructions with over 1300 variations 
has been developed for the special purpose Golay transform tse com- 
puter. The 1802 instruction set is given in Table C.l, Appendix C. 
COSMAC instructions can be used alternately as macroinstructions or 
as microinstructions. Table 7.2 summarizes the basic tse instruction 
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TABLE 7.1 

TSE COMPUTER CONTROL SIGNAL FUNCTIONS 


Name 

Number of Bits 

Function 

EFl 

1 

Detects All Zero Tses At ALU Output 

EF2 

1 

External Input Request 

EF3 

1 

External Request to Halt Tse Processing 

Q 

1 

Controls External Input to Latch A 

Output Port 1 

8 

ALU Control 

Output Port 2 

8 

Source Register Output and ALU Control 

Output Port 3 

8 

Destination Register Input and 



Feedback Control 

Output Port 4 

8 

Index Recognition Circuit Control 

Output Port 5 

1 

Acknowledge Input Request 

Output Port 6 

1 

ALU Control 

Output Port 7 

2 

ALU Output Latch Control 
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TABLE 7.2 

TSE INSTRUCTIONS 


Instruction 


Mnemonic 


Operation 


REGISTER OPERATIONS 


MOVE REGISTER TO REGISTER 
{SEE TOUT) 

TMOV REGI .REG2 

{REG2)+REG1 

MOVE IMMEDIATE TO REGISTER 

TMVI REG.MASK 

(MASK)-vREG 

CLEAR REGISTER I 

TCLRI 

a>i 

LOGIC OPERATIONS 

AND REGISTER TO A 

TAND REGI. REGS. MASK 

[(MsK)+(REG 2)3'(A) vREGl 

AMD A TO REGISTER 

TANA REGl,REG2J'tASK 

[(MASK)+(A)].(REG2)^REG1 

AND ItWEDIATE TO REGISTER 

TAN I REGI. REG2. MASK 

(MASK)-(REG2)-i-REGl 

AND REMSTER to a 

TANDN REGI ,REG2, MASK 

[ {MskFOieiSTI' (a)->regi 

AND 5 TO REGISTER 

TANAN REG' .REG2.MASK 

[pASK).(A)]-(REG2hREG1 

OR REGISTER TO A 

TOR RE6*MASK 

E (MASK)* (REG) 3+(A)*A 

OR A TO REGISTER 

TORA REG.MASK 

[(MASK)*(A)]+(REG)-REG 

OR IMMEDIATE TO REGISTER 

TORI REG.MASK 

(MASK)+(REG)-*REG 

OR irE“6iTTER TO A 

TORN REG.MASK 

[{MASK)+(REG)3+(A)^A 

OR A TO REGISTER 

TORAN REG. MASK 

E(M/I5K)+(A}3+(REG)-*REG 

EXCLUSIVE-OR REGISTER TO A 

TXOR REGI ,REG2. MASK 

[(MASK) •(REG2)3®(A) -REGI 

EXCLUSIVE-OR A TO REGISTER 

TXRA REGI .REG2. MASK 

E(MASK)*{A)3©(REG2)>REG1 

EXCLUSIVE-OR IMMEDIATE TO REGISTER 
COMPLEMENT REGISTER 

TXRI REG1.REG2.MASK\ 
TCHR REGI .REG2.MASK / 

{MASi0e(REG2)->REGl 
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TABLE 7.2 (continued)

Instruction                              Mnemonic                Operation

INDEX RECOGNITION

IDENTIFY BASIS POINTS WITH               TIDA X                  Where ID is a tse with 1's in positions
A SURROUND OF INDEX X                                            which correspond to basis points of A
                                                                 which have a surround of index X.

COMPARE OPERATIONS

CONTRACT REGISTER                        TCNT REG,MASK           $(MASK) \cdot (REG)$; IF $= 0$, $0 \rightarrow RF$; IF $\neq 0$, $1 \rightarrow RF$
TEST REGISTER                            TTEST REG,MASK          $\overline{\overline{(MASK)} + (REG)}$; IF $= 0$, $0 \rightarrow RF$; IF $\neq 0$, $1 \rightarrow RF$
COMPARE REGISTER                         TCMP REG,MASK           $[(MASK) \cdot (REG)] \oplus [(MASK) \cdot (A)]$; IF RESULT $= 0$, $0 \rightarrow RF$; IF RESULT $\neq 0$, $1 \rightarrow RF$
COMPARE IMMEDIATE                        TCPI REG,MASK           $(MASK) \oplus (REG)$; IF $(REG) = (MASK)$, $0 \rightarrow RF$; IF $(REG) \neq (MASK)$, $1 \rightarrow RF$

TSE BRANCH OPERATIONS

TSE SHORT BRANCH ON ZERO                 TBZ ADR                 IF $RF = 0$, $M(R(P)) \rightarrow R(P).0$; ELSE $R(P)+1$
TSE SHORT BRANCH ON NO ZERO              TBNZ ADR                IF $RF \neq 0$, $M(R(P)) \rightarrow R(P).0$; ELSE $R(P)+1$
TSE LONG BRANCH ON ZERO                  TLBZ ADR                IF $RF = 0$, $M(R(P)) \rightarrow R(P).1$, $M(R(P)+1) \rightarrow R(P).0$; ELSE $R(P)+2$
TSE LONG BRANCH ON NO ZERO               TLBNZ ADR               IF $RF = 1$, $M(R(P)) \rightarrow R(P).1$, $M(R(P)+1) \rightarrow R(P).0$; ELSE $R(P)+2$

INPUT/OUTPUT OPERATIONS

INPUT TSE                                TIN                     INPUT TSE $\rightarrow A$
OUTPUT TSE (SEE TMOV)                    TOUT REG                $(REG) \rightarrow O$
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set which was created by using the 1802 instructions as microinstruc-
tions. The dummy arguments REG, REG1, REG2, and MASK should be
replaced by the appropriate arguments from Table 7.3 when programs
are written using the COSMAC or tse instructions.

Tse instructions are divided into six basic groups consisting 
of register operations, logic operations, index recognition operations, 
compare operations, branch operations, and input-output operations. 

The register operations facilitate movement of tse data within the 
computer. As is the case in all tse instructions, register O cannot
be specified as a source register, and register I cannot be specified
as a destination register. Logic instructions provide the operations
which are necessary to perform Golay functions. In general, register
A cannot be specified as REG2 when register A is an implicit operand
in the logic operations. The contents of any source register can be
tested using the compare operations. These instructions are typically
used to determine whether or not an image was altered by the last
iteration of a Golay transform. The branch instructions provide
a method for testing the results of a tse operation and for performing
conditional operations. Both short branches, which are limited to
the current memory page, and long branches, which can specify any
memory location, are included in the instruction set. Note that the
tse branch instructions depend on the contents of R(F).0, which is set
to one or zero during each compare operation. The standard 1802
instructions provide additional branching capabilities. For example,
the B2 and BN2 instructions are used to test for external input
requests.
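
The compare-and-branch pattern described above can be sketched in Python. This is only an illustration under stated assumptions: a tse image is modeled as an integer whose bits stand for array points, `tcmp` follows the TCMP entry of Table 7.2, and `shrink` is a hypothetical stand-in for one iteration of a Golay transform, not an algorithm taken from this work.

```python
def tcmp(reg, a, mask):
    """Model of TCMP: return the flag value left in R(F).0.

    0 means the masked parts of REG and A are identical, 1 means they differ.
    """
    result = (mask & reg) ^ (mask & a)
    return 0 if result == 0 else 1


def shrink(image):
    """Hypothetical one-step transform: clears the lowest set bit."""
    return image & (image - 1)


def iterate_until_stable(image, mask):
    """Repeat the transform until TCMP reports that nothing changed.

    This mirrors a TCMP followed by a TBNZ back to the top of the loop.
    """
    iterations = 0
    while True:
        previous = image
        image = shrink(image)
        iterations += 1
        if tcmp(image, previous, mask) == 0:  # TBNZ would fall through here
            return image, iterations
```

The loop ends on the first iteration that leaves the image unchanged, which is the condition the compare instructions are said to detect.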


TABLE 7.3 

REGISTER AND MASK CONSTANT DEFINITIONS 
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Name                              Symbol    Decimal Value

COSMAC REGISTERS

REGISTER 0                        R0        0
REGISTER 1                        R1        1
REGISTER 2                        R2        2
REGISTER 3                        R3        3
REGISTER 4                        R4        4
REGISTER 5                        R5        5
REGISTER 6                        R6        6
REGISTER 7                        R7        7
REGISTER 8                        R8        8
REGISTER 9                        R9        9
REGISTER 10                       RA        10
REGISTER 11                       RB        11
REGISTER 12                       RC        12
REGISTER 13                       RD        13
REGISTER 14                       RE        14
REGISTER 15                       RF        15

TSE REGISTERS

REGISTER A                        A         225
REGISTER B                        B         210
REGISTER C                        C         180
REGISTER O                        O         120
REGISTER I                        I         8

TSE MASKS

ALL ZERO MASK                     M0        0
ALL ONE MASK                      M         12
SUBFIELD ONE MASK                 M1        9
SUBFIELD TWO MASK                 M2        10
SUBFIELD THREE MASK               M3        15
SUBFIELDS ONE AND TWO MASK        M12       11
SUBFIELDS ONE AND THREE MASK      M13       14
SUBFIELDS TWO AND THREE MASK      M23       13
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Microprogram Control of Tse Operations

When an application program is assembled for the tse computer, 
each tse instruction generates a multiple byte operational code which 
is actually a group of 1802 instructions and data bytes. This infor- 
mation is used to produce the control signal sequence required to 
perform the specified tse operation. Some tse instructions, such as 
TCLRI and the tse branch instructions, require less than six bytes of 
microcode to control their execution. The microcode expansions of 
these instructions are used directly as their operational codes. 

Thus, these instructions are self-contained in the sense that they 
do not require a separate microprogram to control their execution. 

The remaining tse instructions require somewhat longer control programs 
which are executed as called subroutines. The subroutine call is 
performed by an 1802 set P instruction which is used as the first
byte of the tse instruction operational code. Control signal sequences 
are specified by the remaining bytes of the operational code. Control 
microprograms always return with a SEP R3 instruction since all tse 
computer applications programs will use R(3) as their program counter. 

Figure 7.6 is a flow chart for the general ALU operations control 
program which is listed in Figure D.l, Appendix D. This microprogram 
controls the execution of all the tse register and logic operations 
except TCLRI. These instructions require a six byte operational code. 
The first byte is a SEP R9 instruction which calls the general ALU 
operations microprogram. The remaining five bytes contain the ALU mask, 
tse register output, port six, tse register input number one, and tse 
register input number two control bytes which are output by the ALU 


[Flow chart. The legible boxes, in order, read as follows.]

   OUTPUT MASK CONTROL BYTE TO PORT 1,
   OUTPUT REGISTER OUTPUT CONTROL BYTE TO PORT 2,
   AND OUTPUT THE PORT 6 CONTROL BYTE,
   ALL FROM THE OP CODE AT M(R(3))

   DELAY 9 GATE DELAYS

   OUTPUT 1 TO PORT 7 TO TURN THE ALU OUTPUT AND GATE OFF

   DELAY 2 GATE DELAYS

   OUTPUT 0 TO PORTS 1, 2 AND 6

   OUTPUT TSE REGISTER INPUT CONTROL BYTE NO. 1
   FROM THE OP CODE AT M(R(3)) TO PORT 3

   DELAY 2 GATE DELAYS

   OUTPUT TSE REGISTER INPUT CONTROL BYTE NO. 2
   FROM THE OP CODE AT M(R(3)) TO PORT 3

   DELAY 2 GATE DELAYS

   OUTPUT 11110000 TO PORT 3 TO TURN ALL TSE DESTINATION LATCH
   INPUTS OFF WITH ALL FEEDBACK PATHS ON. OUTPUT 0 TO PORT 7
   TO TURN THE ALU OUTPUT LATCH OFF

   YES: SET BIT 3 OF PORT 5 TO ACKNOWLEDGE REQUEST,
   BRANCH TO INPUT SERVICE ROUTINE

Figure 7.6 A flow chart for the general ALU operations control program
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operations microprogram to control execution of the specified tse 
instruction.

The general ALU operations control program utilizes the versatile
I/O capability of the 1802 to minimize the complexity of the control
microprogram. Control bytes which vary from instruction to instruction
within the tse register and logic instruction classes are output
directly from the application program memory space addressed by R(3).
This eliminates the need to specifically decode the individual tse
register and logic instructions before initiating their execution.
Constant control bytes are output as immediate data from the control
microprogram to minimize the length of the tse instruction operational
codes. The 1802 microprocessor is particularly efficient at performing
these tasks because the output data pointer, R(X), is automatically
incremented during each output operation, and the register which is
assigned as the data pointer can be changed by a single set X
instruction.

Time delays are included in the microprogram to account for the 
relatively long propagation delay of the tse logic devices. The 
long delay subroutine MLDLY listed in Figure D.2, Appendix D, is called
by the ALU operations microprogram to create the time delays. A 
standard subroutine call and return technique [19] is employed, and 
two data bytes are passed to MLDLY to specify the length of the delay. 

The 1802 executes most COSMAC instructions in two machine cycles which 
consist of eight machine states each. To simplify control signal timing, 
a 3.2MHz clock frequency was chosen for the 1802 microprocessor. This 
frequency permits the 1802 and associated components to be operated 
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from a five volt power supply and provides exactly 2000 machine cycles
in five milliseconds. Thus, 1000 two-cycle 1802 instructions can be
executed within the propagation delay of a tse logic gate.
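
The timing arithmetic above can be checked with a short Python sketch. The clock frequency, the eight clock periods per machine cycle, the five millisecond gate delay, and the LDLY formula of 8N+22 cycles (Table 7.5) are taken from the text; the allowance of two trailing cycles for the instruction that follows each delay is an assumption made here because it reproduces the LDLY arguments 247, 997, and 2247 that appear in Figure 7.14.

```python
CLOCK_HZ = 3_200_000        # 1802 clock frequency chosen in the text
CLOCKS_PER_CYCLE = 8        # clock periods in one machine cycle
GATE_DELAY_S = 0.005        # propagation delay of one tse logic gate

# Machine cycles that fit into one tse gate delay.
CYCLES_PER_GATE_DELAY = round(CLOCK_HZ * GATE_DELAY_S / CLOCKS_PER_CYCLE)


def ldly_cycles(n):
    """Machine cycles consumed by LDLY with argument n (8N+22, Table 7.5)."""
    return 8 * n + 22


def ldly_argument(gate_delays, trailing_cycles=2):
    """Argument N that makes LDLY plus the trailing cycles span the delay.

    trailing_cycles is an assumption: one two-cycle instruction is taken
    to follow the delay and to be counted as part of it.
    """
    total = gate_delays * CYCLES_PER_GATE_DELAY - trailing_cycles
    n, remainder = divmod(total - 22, 8)
    if remainder:
        raise ValueError("delay is not reachable with an integer N")
    return n
```

For one, four, and nine gate delays this gives N = 247, 997, and 2247, respectively.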

Figure 7.7 is a timing diagram for the tse register and logic
instructions which execute under control of the general ALU operations
microprogram. The execution time for these instructions is 19 gate
delays. Flags EF1, EF2, and EF3 are tested during the execution of
these instructions. At the end of the tse instruction execution cycle,
R(F).0 will contain zero if the image at the output of the ALU was an all
zero tse. Otherwise, R(F).0 will be set equal to one. Note that the
contents of R(F).0 are not necessarily indicative of the result of a
tse OR instruction since the final OR operation is normally performed
at the destination register.

A listing of the tse computer compare operations control program
is provided in Figure D.3, Appendix D, and a flow chart for the micro-
program is shown in Figure 7.8. The tse compare instructions have
four byte operational codes which consist of a SEP RC instruction
and three control bytes. Only three control bytes are required
because there is no destination register. With this single exception,
the tse compare operations microprogram performs essentially the same
function as the general ALU operations microprogram. Tse compare
instructions execute in 15 gate delays (Figure 7.9).

One of the most complex tse computer operations is performed by
the index recognition instruction. The index recognition control
microprogram is listed in Figure D.4, Appendix D. Figure 7.10 is a
flow chart for this microprogram. The index recognition instruction
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[Flow chart. The legible boxes, in order, read as follows.]

   START

   OUTPUT MASK CONTROL BYTE TO PORT 1,
   OUTPUT REGISTER OUTPUT CONTROL BYTE TO PORT 2,
   AND OUTPUT THE PORT 6 CONTROL BYTE,
   ALL FROM THE OP CODE AT M(R(3))

   DELAY 3 GATE DELAYS

   OUTPUT 1 TO PORT 7 TO TURN THE ALU OUTPUT GATE OFF

   PUT 0 INTO R(F).0

   NO: PUT 1 INTO R(F).0

   OUTPUT 0 TO PORTS 1, 2, 6, AND 7 TO TURN THE TSE COMPONENTS OFF

   SET BIT 3 OF PORT 5 TO ACKNOWLEDGE REQUEST

   BRANCH TO INPUT SERVICE ROUTINE

Figure 7.8 A flow chart for the tse compare operations control program.

Figure 7.10 A flow chart for the index recognition control program
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operational code varies between three and eight bytes in length 
depending upon the weight of the index. Each index recognition instruc- 
tion begins with a SEP RA. The second byte of the operational code 
specifies the weight of the index, and the remaining bytes provide 
the control bits which are output to drive the index recognition masks 
and enable register I. The six least significant bits of the control 
bytes correspond to the six Golay neighbors of a basis point but have 
the complementary logic state. 

Index recognition images are ORed into register I so that multiple 
indices can be easily recognized. The index recognition instruction 
has a variable execution time consisting of nine gate delays for each 
orientation of the specified surround. A timing diagram for recognizing
an index with a weight of two is shown in Figure 7.11. Most indices
have a weight of six and can be recognized in 54 gate delays.
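
The relation between index weight, operational code length, and execution time can be summarized in a short Python sketch. The byte layout (SEP RA, a weight byte, then one control byte per orientation) and the nine gate delays per orientation come from the text. The helper `neighbor_field` is hypothetical: the assignment of neighbor k to bit k is an assumption, and the two upper control bits are left out because their encoding is not given here.

```python
GATE_DELAYS_PER_ORIENTATION = 9   # from the text
MAX_WEIGHT = 6                    # weight six gives the eight byte maximum


def _check(weight):
    if not 1 <= weight <= MAX_WEIGHT:
        raise ValueError("weight must lie between 1 and 6")


def index_op_code_length(weight):
    """SEP RA byte + weight byte + one control byte per orientation."""
    _check(weight)
    return 2 + weight


def index_execution_time(weight):
    """Execution time of the index recognition instruction in gate delays."""
    _check(weight)
    return GATE_DELAYS_PER_ORIENTATION * weight


def neighbor_field(neighbors):
    """Six-bit field with the complementary logic state of six neighbors.

    neighbors is a sequence of six 0/1 values; bit k of the result is set
    when neighbor k is 0 (assumed bit ordering).
    """
    if len(neighbors) != 6:
        raise ValueError("exactly six Golay neighbors are expected")
    field = 0
    for k, value in enumerate(neighbors):
        if not value:
            field |= 1 << k
    return field
```

Weights of one through six give operational codes of three through eight bytes and execution times of 9 through 54 gate delays, in agreement with the ranges quoted above.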

A special TCLRI instruction is provided for clearing the contents 
of the index recognition register before each set of index recognition 
operations. TCLRI is a self-contained instruction which turns the 
index recognition latch off by executing an 1802 OUT 4 instruction with 
an immediate data byte of zero. Two unit gate delays are inserted at
the end of the TCLRI instruction by calling LDLY (Figure D.5, Appendix
D). This ensures that the index recognition latch will clear before
the next index recognition operation. The TCLRI operational code is 
five bytes long. 

The tse input instruction has a one byte operational code, SEP RD.
Figure D.6, Appendix D, is a listing of the microprogram which controls
the execution of the tse input instruction and Figure 7.12 is a flow 




[Flow chart. The legible boxes, in order, read as follows.]

   START

   SET Q TO ENABLE A TSE INPUT

   OUTPUT 11100000 TO PORT 3 TO TURN THE REGISTER A
   FEEDBACK PATH REFORMATTER OFF

   DELAY 3 GATE DELAYS

Figure 7.12 A flow chart for the tse input control program
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chart for the microprogram. The Q line is set to enable an external 
tse input to latch A, and the feedback path reformatter is turned off 
to clear the register (Figure 7.13). After three gate delays, the 
feedback path is re-enabled, and following two additional gate delays, 

Q is reset to disable the external tse input. Execution time for the
tse input instruction is five gate delays. Both the EF2 and the EF3
flags are tested. The tse output instruction, TOUT, is a special case
of the TMOV instruction and is controlled by the general ALU operations
microprogram.

Tse branch instructions are self-contained operations which per- 
form an 1802 GLO RF instruction followed by the appropriate 1802 branch 
instruction (BZ, BNZ, LBZ, or LBNZ). The short tse branches require 
a three byte operational code, and the long tse branches require a 
four byte operational code. Tse branch instructions execute in essen- 
tially zero gate delays. 
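
The operational codes of the tse branch instructions can be sketched in Python. The opcodes used are the standard 1802 encodings of GLO RF, BZ, BNZ, LBZ, and LBNZ; the function name and the optional page check are illustrative assumptions and are not taken from the macro library.

```python
GLO_RF = 0x8F                            # 1802 GLO with register F
SHORT_BRANCH = {"TBZ": 0x32, "TBNZ": 0x3A}    # 1802 BZ and BNZ
LONG_BRANCH = {"TLBZ": 0xC2, "TLBNZ": 0xCA}   # 1802 LBZ and LBNZ


def encode_tse_branch(mnemonic, target, operand_address=None):
    """Return the operational code bytes of a tse branch instruction.

    operand_address, if given, is the address of the short branch operand
    byte; it is used only for a page check similar in spirit to the one
    the text attributes to the macro library (hypothetical here).
    """
    if mnemonic in SHORT_BRANCH:
        if operand_address is not None and (operand_address >> 8) != (target >> 8):
            raise ValueError("short branch crosses a page boundary")
        return bytes([GLO_RF, SHORT_BRANCH[mnemonic], target & 0xFF])
    if mnemonic in LONG_BRANCH:
        return bytes([GLO_RF, LONG_BRANCH[mnemonic],
                      (target >> 8) & 0xFF, target & 0xFF])
    raise ValueError("unknown tse branch mnemonic: " + mnemonic)
```

A short branch therefore occupies three bytes and a long branch four, as stated above.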

Table 7.4 summarizes the important characteristics of the basic 
tse instruction set which has been developed for the Golay transform 
tse computer. The efficiency of the 1802 microprogram control technique 
is indicated by the fact that the control routines listed in Appendix D 
require only 238 bytes of memory. 

A Cross-Assembler for the Tse Computer 

A cross-assembler has been written to aid in the development of 
tse computer applications programs. The tse computer cross-assembler 
is a macro library which can be used in conjunction with Digital Equip- 
ment Corporation’s RT-11 MACRO assembler [20] and a PDP 11/40 minicom- 
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[Timing diagram. Signals shown: Q, P11-P18, P21-P28, P31-P34, P35-P38,
P41-P48, P61, P71, P72, and SAMPLE FLAGS; time axis marked 0 and 5.]

Figure 7.13 A timing diagram for the tse input operation.



TABLE 7.4 


TSE INSTRUCTION CHARACTERISTICS 


Instruction                Op Code Length   Execution Time    Control Microprogram
Class                      in Bytes         in Gate Delays    Program Counter

Register (except TCLRI)    6                19                R9
TCLRI                      5                2                 -
Logic                      6                19                R9
Index Recognition          3-8              9-54              RA
Compare                    4                15                RC
Branch                     3-4              0                 -
Input                      1                5                 RD
Output                     6                19                R9
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puter to assemble both microprograms and applications programs for 
the tse computer. The RT-11 MACRO assembler features conditional and 
macroassembly capabilities [20] which are utilized in assembling
programs for the tse computer. A brief summary of the RT-11 MACRO
assembler commands is provided in Appendix E. Interested readers can refer
to Korn [21] for a general discussion of macros and conditional assem- 
bly. 

The tse computer macro library defines all of the COSMAC and tse 
instructions as well as the pseudo and delay instructions listed in 
Table 7.5. A complete listing of the tse computer macro library is too 
long to be included here. However, a representative subset of the macro 
library is listed in Appendix F. Each macro definition specifies the 
symbolic name of an instruction. Operational code bytes for the instruc- 
tion can be calculated by the macro assembler if they cannot be specified 
as constants. Both logical operations and conditional tests are utilized 
in computing the calculated operational codes. Since the PDP 11/40
utilizes 16 bit words, the .BYTE operation is used to truncate the words
to eight bits. Many of the macro definitions include a call to another
macro which will test for illegal conditions. For example, the short
branch instruction macros call $$$PAG, which checks for an illegal 
branch across page boundaries. 

New instructions can be added to the repertoire of the tse computer 
by defining additional macros for them. As an example, consider a tse 
mix operation that combines subfields from the accumulator and another 
source register to form the resultant image. This operation is given 
the symbolic name TMIX and assigned three arguments. The first argument 


TABLE 7.5 

SPECIAL TSE COMPUTER INSTRUCTIONS 
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Instruction                   Mnemonic   Operation

PSEUDO OPERATIONS

ORIGIN                        ORG        Specifies program starting address
END                           END        Marks the end of a source program
DATA BYTE                     DB         Places a data byte in the object file
DATA WORD                     DW         Places a data word (two bytes) in the
                                         object file
DATA STORAGE                  DS         Reserves a set of memory locations
                                         for data storage

MACRO OPERATIONS

SUBROUTINE CALL               CALL       Sets P to 4 to initiate the standard
                                         subroutine call procedure
RETURN FROM SUBROUTINE        RSR        Sets P to 5 to initiate the standard
                                         subroutine return procedure
LOAD IMMEDIATE, REGISTER      LOAD       Loads the given 16 bit constant into
                                         the specified COSMAC register
LONG DELAY,                   LDLY       Delay 8N+22 cycles
RETURN TO R(3)
LONG DELAY,                   MLDLY      Delay 8N+30 cycles
RETURN TO CALLING
PROGRAM COUNTER
DELAY 3                       DLY3       Delay 3 cycles
DELAY 4                       DLY4       Delay 4 cycles
DELAY 6                       DLY6       Delay 6 cycles
DELAY 9                       DLY9       Delay 9 cycles
DELAY 10                      DLY10      Delay 10 cycles
DELAY 12                      DLY12      Delay 12 cycles
DELAY 15                      DLY15      Delay 15 cycles
DELAY 18                      DLY18      Delay 18 cycles
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is REG1 which specifies the destination register. The second argument,
REG2, identifies the source register that is to be mixed with the
accumulator. A third argument, MASK, is required to specify the subfields
of REG2 that should appear in the resultant image. The switching
expression for this operation is

$$[(MASK) \cdot (REG2)] + [\overline{(MASK)} \cdot (A)] \rightarrow REG1 .$$    (7 )

Figure 7.14 provides a listing of the macro definition for the TMIX 
instruction. 

The TMIX instruction is written to execute with R(3) as the program
counter. Mask, register output, and port six control bytes are
output as immediate data to gate the specified subfields of REG2 into
the ALU output latch. A long delay subroutine, LDLY (Figure D.5,
Appendix D), is then called to insert a delay which allows the image
to propagate through the ALU. After nine gate delays the ALU output
image will consist of the specified subfields of REG2 and zeros in the
unspecified subfields. This image is temporarily stored in the ALU
output latch, instead of being sent to the destination register.

To complete the TMIX operation, the contents of register A 
should be ANDed with the complement of the subfield mask specified in 
the TMIX macro call and ORed with the current contents of the ALU out- 
put latch. Then the resultant image should be stored in the desti- 
nation register, REGl. These operations can be executed under control 
of the general ALU operations microprogram (Figure D.l, Appendix D). 

A SEP R9 instruction byte is included in the TMIX instruction opera- 
tional code to initiate the microprogram call. Conditional assembly 




        .MACRO  TMIX    REG1,REG2,MASK      ; MIX REGISTER WITH A
        .NLIST  SRC
        .BYTE   ^O141                       ; OUT 1
        .BYTE   MASK                        ; MASK CONTROL BYTE
        .BYTE   ^O142                       ; OUT 2
        .BYTE   <^O017&REG2>                ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^O146                       ; OUT 6
        .BYTE   ^B00000001                  ; PORT SIX CONTROL BYTE, P61 ON
        .BYTE   ^O327                       ; LDLY 2247.
        DW      4307                        ; DELAY NINE GATE DELAYS
        .BYTE   ^O147                       ; OUT 7
        .BYTE   ^B00000011                  ; TURN ALU OUTPUT LATCH ON
        .BYTE   ^O327                       ; LDLY 997.
        DW      1745                        ; DELAY FOUR GATE DELAYS
        .BYTE   ^O147                       ; OUT 7
        .BYTE   ^B00000001                  ; TURN AND GATE AT OUTPUT OF ALU OFF
        .BYTE   ^O327                       ; LDLY 247.
        DW      0367                        ; DELAY ONE GATE DELAY
        .BYTE   ^O331                       ; SEP R9
        .IIF EQ, MASK,       .BYTE ^O300    ; CONDITIONAL MASK
        .IIF EQ, MASK-^O014, .BYTE ^O000    ; CONTROL BYTES
        .IIF EQ, MASK-^O011, .BYTE ^O320
        .IIF EQ, MASK-^O012, .BYTE ^O340
        .IIF EQ, MASK-^O017, .BYTE ^O260
        .IIF EQ, MASK-^O013, .BYTE ^O360
        .IIF EQ, MASK-^O016, .BYTE ^O240
        .IIF EQ, MASK-^O015, .BYTE ^O220
        .BYTE   ^B10000001                  ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000001                  ; PORT SIX CONTROL BYTE, P61 ON
        .BYTE   REG1                        ; REGISTER INPUT CONTROL BYTE NUMBER 1
        .BYTE   <^O360!REG1>                ; REGISTER INPUT CONTROL BYTE NUMBER 2
        .LIST   SRC
        .ENDM   TMIX

Figure 7.14 A macro definition for the TMIX instruction.
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statements are used to establish the correct mask byte. The register 
output, port six, and register input bytes required by the general 
ALU operations microprogram are also included in the TMIX instruction 
operational code. 
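
The byte sequence generated by the TMIX macro can be mimicked in Python as a check on the listing in Figure 7.14. The register and mask constants are those of Table 7.3 and the octal constants are those of the listing. The function names, and the choice of high byte first for the two bytes produced by DW, are assumptions of this sketch.

```python
# Register and mask constants from Table 7.3.
REGISTERS = {"A": 225, "B": 210, "C": 180, "O": 120}
MASKS = {"M0": 0, "M": 12, "M1": 9, "M2": 10,
         "M3": 15, "M12": 11, "M13": 14, "M23": 13}

# Conditional assembly table of Figure 7.14: mask value -> second mask byte.
SECOND_MASK_BYTE = {0o00: 0o300, 0o14: 0o000, 0o11: 0o320, 0o12: 0o340,
                    0o17: 0o260, 0o13: 0o360, 0o16: 0o240, 0o15: 0o220}

OUT1, OUT2, OUT6, OUT7 = 0o141, 0o142, 0o146, 0o147
LDLY_CALL = 0o327     # 0xD7, the 1802 SEP R7 opcode that starts LDLY
SEP_R9 = 0o331        # call of the general ALU operations microprogram


def dw(value):
    """Two byte data word; high byte first is an assumption."""
    return [(value >> 8) & 0xFF, value & 0xFF]


def ldly(count):
    """Bytes of one LDLY call with the given delay count."""
    return [LDLY_CALL] + dw(count)


def tmix(reg1, reg2, mask):
    """Operational code bytes of TMIX REG1,REG2,MASK per Figure 7.14."""
    code = [OUT1, mask, OUT2, 0o17 & reg2, OUT6, 0b00000001]
    code += ldly(2247)                       # nine gate delays
    code += [OUT7, 0b00000011]
    code += ldly(997)                        # four gate delays
    code += [OUT7, 0b00000001]
    code += ldly(247)                        # one gate delay
    code += [SEP_R9, SECOND_MASK_BYTE[mask], 0b10000001, 0b00000001,
             reg1, 0o360 | reg1]
    return bytes(code)
```

One observation from this table: the upper four bits of each second mask byte equal the Table 7.3 value of the complementary mask (for example, M1 maps to octal 320, whose upper four bits are 13, the value of M23). This is consistent with the statement that register A is ANDed with the complement of the specified subfield mask.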

A timing diagram which illustrates the control signals for the
TMIX instruction is shown in Figure 7.15. Thirty-four gate delays
are required to execute the TMIX instruction. This operation could
be performed by a sequence of basic tse instructions. For example,

TANI C,B,M1
TORA C,M23

is equivalent to

TMIX C,B,M1 .

The advantage of using the microprogrammed TMIX instruction is a
reduction in execution time: 34 gate delays instead of the 38 required
by two 19 gate delay logic instructions.
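
The equivalence of the two instruction sequences can be verified with a small bitwise model in Python. The switching expressions follow Table 7.2 and the TMIX expression given earlier. The 12-point, one-dimensional image and the assignment of every third point to the same subfield are assumptions made only to keep the sketch short; they stand in for the hexagonal Golay subfields.

```python
WIDTH = 12                               # number of points in the toy image
ALL = (1 << WIDTH) - 1

# Toy subfield masks: point i belongs to subfield (i mod 3) + 1.
M1 = sum(1 << i for i in range(0, WIDTH, 3))
M2 = M1 << 1
M3 = M1 << 2
M23 = M2 | M3


def tani(reg2, mask):
    """TANI: (MASK) AND (REG2) -> REG1."""
    return mask & reg2


def tora(reg, a, mask):
    """TORA: [(MASK) AND (A)] OR (REG) -> REG."""
    return (mask & a) | reg


def tmix(reg2, a, mask):
    """TMIX: [(MASK) AND (REG2)] OR [NOT(MASK) AND (A)] -> REG1."""
    return (mask & reg2) | (~mask & ALL & a)


def equivalent(a, b):
    """Compare TANI C,B,M1 followed by TORA C,M23 with TMIX C,B,M1."""
    c = tani(b, M1)
    c = tora(c, a, M23)
    return c == tmix(b, a, M1)
```

The two results agree because M23 is the complement of M1 when the image is partitioned into three subfields.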

Application Program Examples 

The tse computer can perform a variety of Golay transforms. Figures
G.1 and G.2, Appendix G, are listings of tse computer programs
for performing the Golay transform skeletonizing and swelling algorithms,
respectively. The skeletonizing program is 129 bytes long and requires
607 unit gate delays to process a simple image. The swelling program
is slightly more complex but requires only 147 bytes of program storage.
One iteration of the swelling algorithm can be performed in 664 unit
gate delays. Neither of these programs utilizes tse register B.
Therefore, register B can be used for temporary storage of another image.
The simplicity of these programs is indicative of the power of the tse
computer instruction set and CPU organization.




Tse Computer Performance Evaluation 


Classically, the evaluation of computer performance has proven 
to be a difficult problem which is highly dependent upon the task 
that is assigned to the computer. At this time, the performance of 
the special purpose Golay transform tse computer can best be evaluated
by considering the skeletonizing algorithm. Table 7.6 summarizes the 
performance characteristics of the tse computer in the skeletonizing 
machine application. 

The low hardware cost and relatively low data rate of this machine 
are primarily due to the use of a comparison type index recognition 
circuit. The ability of the tse computer to perform a variety of 
Golay transforms under program control is the primary advantage of 
the machine. 


TABLE 7.6 


PERFORMANCE OF THE TSE COMPUTER AS A SKELETONIZING MACHINE 


Parameter                                        Value

Cost Function                                    97A+69B
Number of Control Signals                        35
Total Component Count                            166
Gate Delays per Iteration (Time in Seconds)      607 (3.04)
Data Rate, Simple Images per Minute              19.77
Average Power Consumption in Watts               204.96
Peak Power Consumption in Watts                  222
Speed-Power Product in Watt-Seconds              622.07
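
The derived entries of Table 7.6 follow from the gate delay count and the five millisecond gate delay quoted earlier. The Python sketch below shows the arithmetic; the function names are illustrative, and the formulas are the obvious ones implied by the column headings rather than definitions quoted from the text.

```python
GATE_DELAY_S = 0.005          # propagation delay of one tse logic gate


def iteration_time_s(gate_delays):
    """Time for one iteration in seconds."""
    return gate_delays * GATE_DELAY_S


def images_per_minute(gate_delays):
    """Data rate when each image needs one pass of the given length."""
    return 60.0 / iteration_time_s(gate_delays)


def speed_power_product(average_power_w, gate_delays):
    """Average power multiplied by the time per iteration, in watt-seconds."""
    return average_power_w * iteration_time_s(gate_delays)


skeleton_time = iteration_time_s(607)             # about 3.04 seconds
skeleton_rate = images_per_minute(607)            # about 19.77 images per minute
skeleton_sp = speed_power_product(204.96, 607)    # about 622 watt-seconds
```

The computed values agree with the tabulated ones to within rounding.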
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CHAPTER 8 
CONCLUSION 

The tse logic devices proposed by Schaefer and Strong [1] have 
been used to develop hardwired, pipelined, and programmable architec- 
tures for Golay transform processing machines. These machines illus- 
trate that tse logic circuits can perform useful image processing 
algorithms which have not been optimized for tse logic processing. 

The key step toward performing Golay transforms with tse logic was 
the development of the Golay neighbor planes generator circuit. This 
circuit facilitates tse logic implementations of the index recognition 
operation. Because the hardware cost and processing rate of a Golay
transform machine are highly dependent upon the index recognition
circuit, several alternate realizations of the index recognition
operation were developed. In addition, several new tse logic devices were
proposed, and a set of performance evaluation parameters was developed
to aid in the design of tse logic circuits. Techniques were also
developed for controlling tse logic circuits with conventional logic
control units constructed from a microcomputer or from programmable logic
arrays.


A Critique of Tse Logic 

The major advantages of tse logic devices are their ability to 
operate on a large number of data points simultaneously and the high 
data processing rates which can potentially be achieved due to this 
parallelism. The fan-in and fan-out limitations of the devices are 
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drawbacks since they tend to increase both the hardware cost and 
propagation delay of a tse logic circuit. Perhaps the most serious 
disadvantage of current tse logic circuits is the large number of 
device-to-device interconnections which are required. In conventional
integrated logic circuits, manufacturing costs and failure rates
both increase substantially with any increase in the number of external
connections required by the circuit. Although improved interconnection
techniques will have to be developed for tse logic circuits, there is
currently no evidence to support a theory that the cost and reliability
of tse logic circuits will not be heavily dependent upon the
device-to-device interfaces. In fact, since the same basic integrated circuit
technology is used to fabricate both conventional logic and active 
tse logic devices, the cost and failure rate characteristics of the 
conventional devices can be expected to prevail. 

Suggested Directions for Future Research 

One method of reducing the number of device-to-device interfaces 
in a tse logic circuit is to increase the functional complexity of the 
individual tse logic devices. This technique has been used successfully 
in conventional logic and should be investigated for possible use in 
tse logic. A projection of the integrated circuit complexity, which 
should be realizable by a target date such as 1985, would aid tse device 
designers in the task of partitioning complex logic functions into 
individual circuits. As an example of the usefulness of this approach, 
consider the potential advantage of an integrated tse latch or read- 
write memory. Three active and two passive tse devices are currently 
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required to construct the simplest tse latch. An integrated tse 
latch would reduce the number of device-to-device interfaces from
eight to two. In addition, the integrated tse latch could potentially 
reduce the propagation delay, power consumption, and size of the tse 
latch. 

Although the electro-optical family of tse logic devices was 
used for all the design examples in this dissertation, the basic 
designs and design principles described here are not dependent upon 
the signal transfer technique. Because of the low efficiency which 
is characteristic of electro-optical interfaces, additional signal 
transfer techniques should be investigated. 

An alternate technique for achieving high speed parallel processing 
by using arrays of microprocessors or programmable logic circuits 
should also be investigated. The possible advantages of this technique 
include a reduction in the number of different integrated circuits
which must be fabricated and reduced device-to-device interface
complexity. The possibility of a reduction in interface complexity 
is projected because of the bus oriented structure of microprocessors. 

As a long term project, research should be conducted on the use 
of two-dimensional tse logic control units for tse circuits. If the 
full potential of a completely two-dimensional computer can be realized, 
a revolutionary advancement over the large scale computing capabilities 
of today's computers could be expected. 
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APPENDIX A 

SCHEMATIC SYMBOLS FOR TSE LOGIC DEVICES 

Table A.l lists the tse logic devices and their schematic 
symbols. To simplify schematic drawings, three different symbols 
are used for the interleaver. Note that the input and output surfaces 
are reversed when the interleaver is used as a combiner rather than 
as a duplicator. Also, note that if a slide gate moves an image more
than one matrix position, a number should be included within the slide
gate symbol to indicate the extent of the slide.


TABLE A.1

SCHEMATIC SYMBOLS FOR TSE LOGIC DEVICES

Device                          Symbol

ACTIVE DEVICES

AND
OR
NEGATE
EXCLUSIVE-OR
REFORMAT
TOTAL SPILLER
CONTRACTOR
ROM

PASSIVE DEVICES

INTERLEAVER AS A COMBINER
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TABLE A.l 
(Continued)

Device                          Symbol

INTERLEAVER AS A DUPLICATOR
SLIDE UP
SLIDE DOWN
SLIDE RIGHT
SLIDE LEFT
EXCHANGE
IMAGE BUS-LONG
IMAGE BUS-SHORT
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
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APPENDIX B 
TSE MASK PATTERNS 

Table B.1 defines the mask patterns for the tse read-only-memories 
and programmed output active devices. The ROMs are programmed to 
produce logic one outputs at the array positions specified as active 
and logic zero outputs elsewhere. In the case of programmed output 
active devices, the array outputs which are specified as active are 
normal outputs which can produce a logic one or a logic zero state 
that is a function of the inputs to the active device. The remaining 
array outputs of the programmed output active devices are disabled 
so that they always produce a logic zero output. 

The notation $G_S^T$ is used to specify points in the S Golay subfields
of the array where the array is divided into three, four, or seven
subfields as specified by the type superscript, T. For example,
$G_{23}^{3}$ is the set of all points in the second and third subfields of
an image where the image has been partitioned into three Golay subfields.


TABLE B.1

TSE MASK PATTERNS

Name                                      Symbol

ALL LOGIC ONE TSE                         M
ODD TSE MASK                              MO
EVEN TSE MASK                             ME
ALL LOGIC ZERO TSE                        M0
GOLAY SUBFIELD ONE MASK                   M1
GOLAY SUBFIELD TWO MASK                   M2
GOLAY SUBFIELD THREE MASK                 M3
GOLAY SUBFIELDS ONE AND TWO MASK          M12
GOLAY SUBFIELDS ONE AND THREE MASK        M13
GOLAY SUBFIELDS TWO AND THREE MASK        M23
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APPENDIX C 

THE CDP1802 MICROPROCESSOR 

The CDP1802 COSMAC microprocessor [19] is a byte-oriented
central processing unit constructed as a complementary-symmetry
MOS integrated circuit. A block diagram of the internal structure
of the 1802 (Figure C.1) shows that the COSMAC architecture is based
on an array of 16 general-purpose 16-bit scratch-pad registers. These
general-purpose registers are connected to a common bus and can be
selected by the four-bit N, P, and X registers to perform specific
tasks. The scratch-pad registers can be used as program counters,
data address pointers, general-purpose counters, and temporary
data storage locations. High and low bytes of the scratch-pad
registers can also be gated between the register array and the
eight-bit D register which functions as an accumulator.

One of the outstanding features of the COSMAC architecture is that 
any of the scratch-pad registers can be used as the program counter. 
This permits a very fast and efficient subroutine call which is 
performed by a one-byte set P instruction that simply specifies a 
new program counter. The tse computer control subroutines are called 
by set P instructions. 
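
The set P call can be illustrated with a small model. The sketch below is not RCA code; the `Cosmac` class, its field names, and the choice of R(3) as the main program counter and R(9) as a control program counter are assumptions made only for illustration.

```python
class Cosmac:
    """Toy model of the 1802 register array and the P designator (illustrative only)."""

    def __init__(self):
        self.r = [0] * 16   # sixteen 16-bit scratch-pad registers
        self.p = 0          # four-bit P register: selects the program counter

    def pc(self):
        """Return the value of the register currently acting as program counter."""
        return self.r[self.p]

    def sep(self, n):
        """SET P: N->P. Execution continues at the address held in R(N)."""
        self.p = n & 0xF


cpu = Cosmac()
cpu.r[3] = 0x0100        # assumed main program counter
cpu.r[9] = 0x1000        # assumed entry address of a control subroutine
cpu.p = 3

cpu.sep(9)               # one-byte call: R(9) becomes the program counter
called_at = cpu.pc()
cpu.sep(3)               # one-byte return: R(3) resumes where it stopped
returned_to = cpu.pc()
```

Because the caller's program counter is simply left untouched while another register is selected, no return address has to be saved or restored in this model.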

Another important feature of the 1802 is a flexible input-output 
structure. Four EF flags which can be used as one bit inputs are 
included in the CPU. These flags can be tested by 1802 branch instruc- 
tions and are used to check the status of the tse computer. The Q flag 
functions as a single bit output which can be set, reset, and tested by 
the 1802 CPU. The tse computer input data path is controlled by the 
Q line. In addition to this on-chip I/O, the 1802 includes a set 
of memory-oriented I/O instructions which are used to provide control 
signals to the tse computer. 

Figure C.1 Internal structure of the CDP1802 
COSMAC microprocessor. 

(Courtesy of Solid State Division, RCA Corporation) 

A summary of the COSMAC 1802 microprocessor instruction set is 
given in Table C.1. The notation R(W) indicates the register designated 
by W where W is N, X, or P. When the low order or high order bytes of 
R(W) are referenced individually, the notation R(W).0 refers to the low 
order byte while R(W).1 denotes the high order byte. As an example 
of the operation notation, the symbols 

D->M(R(X)); R(X)-1 

mean that the contents of register D are stored in the memory location 
pointed to by the register selected by the current contents of X and 
that the register specified by X is decremented by one. 
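
The operation notation can be read as a pair of register transfers. The following sketch models the example above under stated assumptions: the function names, the list used for the register array, and the dictionary used for memory are hypothetical and are not part of the 1802 documentation.

```python
def low_byte(register_value):
    """R(W).0: the low order byte of a 16-bit scratch-pad register."""
    return register_value & 0xFF


def high_byte(register_value):
    """R(W).1: the high order byte of a 16-bit scratch-pad register."""
    return (register_value >> 8) & 0xFF


def store_via_x_and_decrement(d, x, r, memory):
    """D->M(R(X)); R(X)-1, with 16-bit wraparound of the register."""
    memory[r[x]] = d & 0xFF          # store D at the address held in R(X)
    r[x] = (r[x] - 1) & 0xFFFF       # then decrement R(X)


registers = [0] * 16
registers[2] = 0x12FF                # assume X = 2 and R(2) points at 0x12FF
memory = {}
store_via_x_and_decrement(0x5A, 2, registers, memory)
```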

Several features of the 1802 instruction set should be noted. 
First, all COSMAC instructions except the long branch, long skip, 
and NOP instructions execute in two machine cycles consisting of eight 
states each. The long branch, long skip, and NOP instructions execute 
in three machine cycles. This feature simplifies the realization of 
precisely timed control signals. Second, most of the 1802 instructions 
require only a simple one-byte operational code which conserves memory. 
Third, all the logic and arithmetic operations utilize the contents of 
D and the contents of memory as operands. Data stored in scratch-pad 
registers cannot be operated on by these instructions. Finally, note 
that the memory address required by the I/O instructions is obtained from 


TABLE C.1 

CDP1802 MICROPROCESSOR INSTRUCTION SET [19] 

Instruction                  Mnemonic   Operation 

MEMORY REFERENCE 

LOAD VIA N                   LDN        M(R(N))->D; FOR N NOT 0 
LOAD ADVANCE                 LDA        M(R(N))->D; R(N)+1 
LOAD VIA X                   LDX        M(R(X))->D 
LOAD VIA X AND ADVANCE       LDXA       M(R(X))->D; R(X)+1 
LOAD IMMEDIATE               LDI        M(R(P))->D; R(P)+1 
STORE VIA N                  STR        D->M(R(N)) 
STORE VIA X AND DECREMENT    STXD       D->M(R(X)); R(X)-1 

REGISTER OPERATIONS 

INCREMENT REG N              INC        R(N)+1 
DECREMENT REG N              DEC        R(N)-1 
INCREMENT REG X              IRX        R(X)+1 
GET LOW REG N                GLO        R(N).0->D 
PUT LOW REG N                PLO        D->R(N).0 
GET HIGH REG N               GHI        R(N).1->D 
PUT HIGH REG N               PHI        D->R(N).1 

LOGIC OPERATIONS 

OR                           OR         M(R(X)) OR D->D 
OR IMMEDIATE                 ORI        M(R(P)) OR D->D; R(P)+1 
EXCLUSIVE OR                 XOR        M(R(X)) XOR D->D 
EXCLUSIVE OR IMMEDIATE       XRI        M(R(P)) XOR D->D; R(P)+1 


TABLE C.1 
(Continued) 

Instruction                  Mnemonic   Operation 

AND                          AND        M(R(X)) AND D->D 
AND IMMEDIATE                ANI        M(R(P)) AND D->D; R(P)+1 
SHIFT RIGHT                  SHR        SHIFT D RIGHT, LSB(D)->DF, 
                                        0->MSB(D) 
SHIFT RIGHT WITH CARRY       SHRC       SHIFT D RIGHT, LSB(D)->DF, 
                                        DF->MSB(D) 
RING SHIFT RIGHT             RSHR       SAME AS SHRC 
SHIFT LEFT                   SHL        SHIFT D LEFT, MSB(D)->DF, 
                                        0->LSB(D) 
SHIFT LEFT WITH CARRY        SHLC       SHIFT D LEFT, MSB(D)->DF, 
                                        DF->LSB(D) 
RING SHIFT LEFT              RSHL       SAME AS SHLC 

CONTROL INSTRUCTIONS 

IDLE                         IDL        WAIT FOR DMA OR 
                                        INTERRUPT; M(R(0))->BUS 
NO OPERATION                 NOP        CONTINUE 
SET P                        SEP        N->P 
SET X                        SEX        N->X 
SET Q                        SEQ        1->Q 
RESET Q                      REQ        0->Q 
SAVE                         SAV        T->M(R(X)) 
PUSH X,P TO STACK            MARK       (X,P)->T; (X,P)->M(R(2)) 
                                        THEN P->X; R(2)-1 
RETURN                       RET        M(R(X))->(X,P); R(X)+1 
                                        1->IE 
DISABLE                      DIS        M(R(X))->(X,P); R(X)+1 
                                        0->IE 


TABLE C.1 
(Continued) 

Instruction                        Mnemonic   Operation 

BRANCH INSTRUCTIONS - SHORT BRANCH 

SHORT BRANCH                       BR         M(R(P))->R(P).0 
NO SHORT BRANCH (SEE SKP)          NBR        R(P)+1 
SHORT BRANCH IF D=0                BZ         IF D=0, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF D NOT 0            BNZ        IF D NOT 0, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF DF=1               BDF        IF DF=1, M(R(P))->R(P).0 
SHORT BRANCH IF POS OR ZERO        BPZ        ELSE R(P)+1 
SHORT BRANCH IF EQUAL OR GREATER   BGE 
SHORT BRANCH IF DF=0               BNF        IF DF=0, M(R(P))->R(P).0 
SHORT BRANCH IF MINUS              BM         ELSE R(P)+1 
SHORT BRANCH IF LESS               BL 
SHORT BRANCH IF Q=1                BQ         IF Q=1, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF Q=0                BNQ        IF Q=0, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF EF1=1              B1         IF EF1=1, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF EF1=0              BN1        IF EF1=0, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF EF2=1              B2         IF EF2=1, M(R(P))->R(P).0 
                                              ELSE R(P)+1 
SHORT BRANCH IF EF2=0              BN2        IF EF2=0, M(R(P))->R(P).0 
                                              ELSE R(P)+1 


TABLE C.1 
(Continued) 

Instruction                  Mnemonic   Operation 

SHORT BRANCH IF EF3=1        B3         IF EF3=1, M(R(P))->R(P).0 
                                        ELSE R(P)+1 
SHORT BRANCH IF EF3=0        BN3        IF EF3=0, M(R(P))->R(P).0 
                                        ELSE R(P)+1 
SHORT BRANCH IF EF4=1        B4         IF EF4=1, M(R(P))->R(P).0 
                                        ELSE R(P)+1 
SHORT BRANCH IF EF4=0        BN4        IF EF4=0, M(R(P))->R(P).0 
                                        ELSE R(P)+1 

INPUT - OUTPUT BYTE TRANSFER 

OUTPUT 1                     OUT 1      M(R(X))->BUS; R(X)+1; N LINES=1 
OUTPUT 2                     OUT 2      M(R(X))->BUS; R(X)+1; N LINES=2 
OUTPUT 3                     OUT 3      M(R(X))->BUS; R(X)+1; N LINES=3 
OUTPUT 4                     OUT 4      M(R(X))->BUS; R(X)+1; N LINES=4 
OUTPUT 5                     OUT 5      M(R(X))->BUS; R(X)+1; N LINES=5 
OUTPUT 6                     OUT 6      M(R(X))->BUS; R(X)+1; N LINES=6 
OUTPUT 7                     OUT 7      M(R(X))->BUS; R(X)+1; N LINES=7 
INPUT 1                      INP 1      BUS->M(R(X)); BUS->D; N LINES=1 
INPUT 2                      INP 2      BUS->M(R(X)); BUS->D; N LINES=2 
INPUT 3                      INP 3      BUS->M(R(X)); BUS->D; N LINES=3 
INPUT 4                      INP 4      BUS->M(R(X)); BUS->D; N LINES=4 
INPUT 5                      INP 5      BUS->M(R(X)); BUS->D; N LINES=5 
INPUT 6                      INP 6      BUS->M(R(X)); BUS->D; N LINES=6 
INPUT 7                      INP 7      BUS->M(R(X)); BUS->D; N LINES=7 



TABLE C.1 
(Continued) 

Instruction                  Mnemonic   Operation 

BRANCH INSTRUCTIONS - LONG BRANCH 

LONG BRANCH                  LBR        M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
NO LONG BRANCH (SEE LSKP)    NLBR       R(P)+2 
LONG BRANCH IF D=0           LBZ        IF D=0, M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
                                        ELSE R(P)+2 
LONG BRANCH IF D NOT 0       LBNZ       IF D NOT 0, M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
                                        ELSE R(P)+2 
LONG BRANCH IF DF=1          LBDF       IF DF=1, M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
                                        ELSE R(P)+2 
LONG BRANCH IF DF=0          LBNF       IF DF=0, M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
                                        ELSE R(P)+2 
LONG BRANCH IF Q=1           LBQ        IF Q=1, M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
                                        ELSE R(P)+2 
LONG BRANCH IF Q=0           LBNQ       IF Q=0, M(R(P))->R(P).1 
                                        M(R(P)+1)->R(P).0 
                                        ELSE R(P)+2 

SKIP INSTRUCTIONS 

SHORT SKIP (SEE NBR)         SKP        R(P)+1 
LONG SKIP (SEE NLBR)         LSKP       R(P)+2 


TABLE C.1 
(Continued) 

Instruction                  Mnemonic   Operation 

LONG SKIP IF D=0             LSZ        IF D=0, R(P)+2 
                                        ELSE CONTINUE 
LONG SKIP IF D NOT 0         LSNZ       IF D NOT 0, R(P)+2 
                                        ELSE CONTINUE 
LONG SKIP IF DF=1            LSDF       IF DF=1, R(P)+2 
                                        ELSE CONTINUE 
LONG SKIP IF DF=0            LSNF       IF DF=0, R(P)+2 
                                        ELSE CONTINUE 
LONG SKIP IF Q=1             LSQ        IF Q=1, R(P)+2 
                                        ELSE CONTINUE 
LONG SKIP IF Q=0             LSNQ       IF Q=0, R(P)+2 
                                        ELSE CONTINUE 
LONG SKIP IF IE=1            LSIE       IF IE=1, R(P)+2 
                                        ELSE CONTINUE 

ARITHMETIC OPERATIONS 

ADD                          ADD        M(R(X))+D->DF, D 
ADD IMMEDIATE                ADI        M(R(P))+D->DF, D; R(P)+1 
ADD WITH CARRY               ADC        M(R(X))+D+DF->DF, D 
ADD WITH CARRY, IMMEDIATE    ADCI       M(R(P))+D+DF->DF, D; 
                                        R(P)+1 
SUBTRACT D                   SD         M(R(X))-D->DF, D 
SUBTRACT D IMMEDIATE         SDI        M(R(P))-D->DF, D; R(P)+1 
SUBTRACT D WITH BORROW       SDB        M(R(X))-D-(NOT DF)->DF, D 
SUBTRACT D WITH BORROW,      SDBI       M(R(P))-D-(NOT DF)->DF, D; 
IMMEDIATE                               R(P)+1 
SUBTRACT MEMORY              SM         D-M(R(X))->DF, D 



TABLE C.1 
(Continued) 

Instruction                  Mnemonic   Operation 

SUBTRACT MEMORY,             SMI        D-M(R(P))->DF, D; 
IMMEDIATE                               R(P)+1 
SUBTRACT MEMORY WITH         SMB        D-M(R(X))-(NOT DF)->DF, D 
BORROW 
SUBTRACT MEMORY WITH         SMBI       D-M(R(P))-(NOT DF)->DF, D; 
BORROW, IMMEDIATE                       R(P)+1 
R(X). Since the contents of register X can be changed by a one-byte 
set X instruction, data can be output efficiently from calling programs, 
data storage areas in memory, and from immediate data bytes. This feature 
is utilized extensively in the tse computer's control programs which 
output both immediate data and data obtained from the calling program. 
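
The immediate-data case can be sketched as follows. This is a simplified model and not the tse computer control program itself: the helper names, the memory image, and the choice of R(9) for both X and P are assumptions. When X and P designate the same register, the output instruction sends the byte that follows it in the program and steps the program counter past that byte.

```python
def fetch(memory, r, p):
    """Fetch the byte addressed by the program counter R(P), then advance R(P)."""
    byte = memory[r[p]]
    r[p] = (r[p] + 1) & 0xFFFF
    return byte


def out(memory, r, x):
    """OUT n: M(R(X))->BUS; R(X)+1. Returns the byte placed on the bus."""
    byte = memory[r[x]]
    r[x] = (r[x] + 1) & 0xFFFF
    return byte


# Assumed program fragment: OUT 7 (0x67), an immediate data byte, then NOP (0xC4).
memory = {0x1000: 0x67, 0x1001: 0x03, 0x1002: 0xC4}
r = [0] * 16
p = x = 9                      # a set X instruction has made X equal to P
r[p] = 0x1000

opcode = fetch(memory, r, p)   # the OUT 7 operation code
bus_byte = out(memory, r, x)   # immediate byte goes to the bus; R(9) skips over it
next_opcode = fetch(memory, r, p)
```

Setting X to the calling program's counter instead makes the same output instruction take its data byte from the calling program.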

In some applications a subroutine is called from several distinct 
programs which may use different program counters. The set P subroutine 
call technique is unsatisfactory for these applications because the sub- 
routine cannot easily determine which register was the calling program 
counter. Two alternate subroutine call procedures are provided to 
simplify this type of subroutine call. The first procedure is a MARK 
subroutine technique [19] in which the MARK instruction is used to 
save the current value of X and P in a software stack. This method 
permits the use of nested subroutines where the nesting order varies 
dynamically. The second procedure is the standard call and return 
technique [19] which uses two linking subroutines to control the call 
and return processes. The standard call and return technique is the 
most advanced call and return method. Advantages of the standard call 
and return technique include unlimited subroutine nesting capability and 
maximum flexibility in storing scratch-pad registers. In the standard 
subroutine call and return technique, registers four and five are 
assumed to point to the linking call subroutine and the linking return 
subroutine, respectively. A call is initiated by setting P to four. 

The address of the called subroutine is specified by two data bytes 
which should follow the set P instruction. Returns are initiated by 
setting P to five. Except during the actual call and return operations, 

both main programs and subroutines which utilize the standard call and 
return technique execute with register three as the program counter. 
All three subroutine call procedures are used in the tse computer 
programs to maximize their efficiency. 

The standard subroutine call and return technique requires some 
of the scratch-pad registers to be dedicated to particular functions. 
The tse computer control programs also assign particular functions to 
certain scratch-pad registers. Table C.2 lists the functions assigned 
to the COSMAC registers in the tse computer control unit application. 




TABLE C.2 

COSMAC CDP1802 REGISTER ASSIGNMENTS 

Register   Function 

R(0)       DMA Address Register 
R(1)       Interrupt Service Program Counter 
R(2)       Stack Pointer 
R(3)       Main Program Counter 
R(4)       Dedicated Program Counter for the Call 
           Subroutine 
R(5)       Dedicated Program Counter for the Return 
           Subroutine 
R(6)       Pointer to the Return Location and Arguments 
           Passed to the Called Subroutine 
R(7)       Dedicated Program Counter for the Long Delay 
           Subroutine with R(3) as the Calling Program 
           Counter 
R(8)       Scratch-Pad Register used by the Long 
           Delay Subroutine 
R(9)       Dedicated Program Counter for the General 
           ALU Operations Control Program 
R(A)       Dedicated Program Counter for the Index 
           Recognition Control Program 
R(B)       Unassigned 
R(C)       Dedicated Program Counter for the Compare 
           Operations Control Program 
R(D)       Dedicated Program Counter for the Input 
           Control Program 
R(E)       Dedicated Program Counter for the Long 
           Delay Subroutine for Variable Calling 
           Program Counters 
R(F)       Unassigned 
APPENDIX D 

CONTROL MICROPROGRAMS FOR THE TSE COMPUTER 

This appendix lists four microprograms which control execution 
of the basic tse instruction set of the Golay transform tse computer. 
Two long delay subroutines are also listed. The total length of 
the microprograms presented in this appendix is 238 bytes. 


TSE COMPUTER GENERAL ALU OPERATIONS CONTROL PROGRAM 

                        ; ALL INSTRUCTIONS EXCEPT TCLRI, TIDA, TCNT, TTEST, 
                        ; TCMP, TCPI, TBZ, TSNZ, TLBZ, TLBNZ AND TIN. 
                        ; R9 FUNCTIONS AS THE PROGRAM COUNTER 
010000          EXALU:  SEP     R3 
010000  323 
010001          ALUOP:  OUT     1               ; OUTPUT MASK CONTROL BYTE 
010001  141 
010002                  OUT     2               ; OUTPUT REGISTER OUTPUT CONTROL BYTE 
010002  142 
010003                  OUT     6               ; OUTPUT PORT 6 CONTROL BYTE 
010003  146 
010004                  MLDLY   2247.           ; DELAY 9 GATE DELAYS 
010004  171 
010005  336 
010006  010 
010007  307 
010010  042 
010011          DEST:   SEX     R9              ; PREPARE TO OUTPUT IMMEDIATE DATA 
010011  351 
010012                  OUT     7               ; TURN AND GATE AT OUTPUT OF ALU ON 
010012  147 
010013                  DB      <^B00000011>    ; AND TURN ALU OUTPUT LATCH FEEDBACK 
010013  003                                     ; PATH ON 


Figure D.1 Tse computer general ALU operations control program 


010014                  MLDLY   997.            ; DELAY 4 GATE DELAYS 
010014  171 
010015  336 
010016  003 
010017  345 
010020  042 
010021                  OUT     7               ; TURN AND GATE AT OUTPUT OF 
010021  147 
010022                  DB      <^B00000001>    ; ALU OFF 
010022  001 
010023                  MLDLY   496.            ; DELAY TWO GATE DELAYS 
010023  171 
010024  336 
010025  001 
010026  360 
010027  042 
010030                  OUT     1               ; TURN ALU MASK ROM'S OFF 
010030  141 
010031                  DB      ^B00000000 
010031  000 
010032                  OUT     2               ; TURN REGISTER OUTPUTS AND REMAINING 
010032  142 
010033                  DB      ^B00000000      ; ALU MASKS OFF 
010033  000 
010034                  OUT     6               ; TURN P61 OFF 
010034  146 
010035                  DB      <^B00000000> 
010035  000 


Figure D.1 (Continued) 


010036                  SEX     R3              ; PREPARE TO OUTPUT DATA FROM MAIN 
010036  343                                     ; PROGRAM 
010037                  OUT     3               ; TURN THE REFORMATTER AT THE INPUT 
010037  143                                     ; OF THE SELECTED LATCH ON AND TURN 
                                                ; THE FEEDBACK PATH REFORMATTER OF THAT 
                                                ; LATCH OFF IF AN OR OPERATION IS 
                                                ; NOT REQUIRED 
010040                  MLDLY   497. 
010040  171 
Figure D.6 Tse computer input control program. 
APPENDIX E 

RT-11 MACRO ASSEMBLER 

This appendix summarizes the features of the RT-11 Macro assembler 
[20] that are utilized in the tse computer cross-assembler. A detailed 
explanation of the Macro assembler can be found in [20]. 

The RT-11 Macro assembler offers three features that are essential 
to the tse computer cross-assembler. First, Macro permits user defined 
macros which allow new instructions to be defined. Second, Macro 
includes numerous conditional assembly directives which simplify 
the generation of special operational codes. Third, Macro provides 
listing control directives which can be used to enhance the readability 
of assembled tse computer programs. 

Macro accepts source statements with up to four fields. The general 
format of a source statement is 

label: operator operand(s) ; comments . 

The label and comment fields are optional. Either the operator or the 
operand field may be omitted depending upon the contents of the other. 
When more than one operand appears in the operand field, the operands 
are separated by one of the legal separating characters defined in Table 
E.1. The legal character set for source statements includes the letters 
A through Z, the digits 0 through 9, and the special characters defined 
in Table E.2. 
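
The four-field statement format can be illustrated with a small parser. This is only a sketch of the field layout described above and is not how Macro itself scans a line: the function name is hypothetical, only the comma is treated as an operand separator, and angle brackets, the up arrow construction, and quoted characters are deliberately ignored.

```python
def parse_statement(line):
    """Split 'label: operator operand(s) ; comments' into its four fields."""
    code, _, comment = line.partition(";")      # the comment field follows ';'
    label = ""
    if ":" in code:                             # the label field ends with ':'
        label, _, code = code.partition(":")
    parts = code.split(None, 1)                 # operator, then the operand field
    operator = parts[0] if parts else ""
    operands = []
    if len(parts) > 1:
        operands = [item.strip() for item in parts[1].split(",")]
    return label.strip(), operator, operands, comment.strip()


full = parse_statement("DEST: SEX R9 ; PREPARE TO OUTPUT IMMEDIATE DATA")
no_label = parse_statement("    .BYTE ^O064, RP&^O377")
```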

Some of the special characters listed in Table E.2 are used as 
operator characters which specify unary or binary operations on the 
given operands. Tables E.3 and E.4 define the legal unary and binary 

TABLE E.1 

LEGAL SEPARATING CHARACTERS [20] 

Character   Definition                 Usage 

space       one or more spaces         A space is a legal separator 
            and/or tabs                only for argument operands. 
                                       Spaces within expressions are 
                                       ignored. 

,           comma                      A comma is a legal separator 
                                       for both expressions and 
                                       argument operands. 

<...>       paired angle brackets      Paired angle brackets are used 
                                       to enclose an argument, 
                                       particularly when that 
                                       argument contains separating 
                                       characters. Paired angle 
                                       brackets may be used anywhere 
                                       in a program to enclose an 
                                       expression for treatment as a 
                                       term. (The angle bracket 
                                       construction should be used 
                                       when the argument contains 
                                       unary operators). 

^\...\      Up arrow construction      This construction is 
            where the up arrow         equivalent in function to the 
            character is followed      paired angle brackets and 
            by an argument             is generally used only 
            bracketed by any           where the argument contains angle 
            paired printing            brackets. 
            characters. 
TABLE E.2 

SPECIAL CHARACTERS [20] 

Character   Designation           Function 

            carriage return       formatting character 
            line feed             source statement terminators 
            form feed 
            vertical tab 
:           colon                 label terminator 
=           equal sign            direct assignment indicator 
%           percent sign          register term indicator 
            tab                   item or field terminator 
            space                 item or field terminator 
#           number sign           immediate expression indicator 
@           at sign               deferred addressing indicator 
(           left parenthesis      initial register indicator 
)           right parenthesis     terminal register indicator 
,           comma                 operand field separator 
;           semicolon             comment field indicator 
<           left angle bracket    initial argument or expression 
                                  indicator 
>           right angle bracket   terminal argument or expression 
                                  indicator 
+           plus sign             arithmetic addition operator or 
                                  auto increment indicator 
-           minus sign            arithmetic subtraction operator 
                                  or auto decrement indicator 
*           asterisk              arithmetic multiplication 
                                  operator 
/           slash                 arithmetic division operator 
&           ampersand             logical AND operator 
!           exclamation point     logical inclusive OR operator 
"           double quote          double ASCII character indicator 
'           single quote          single ASCII character indicator 
^           up arrow              universal unary operator, 
                                  argument indicator 
\           backslash             macro numeric argument indicator 
TABLE E.3 

OPERATOR CHARACTERS 

Unary 
Operator   Explanation            Example 

+          plus sign              +A           (positive value of A, 
                                               equivalent to A) 
-          minus sign             -A           (negative, 2's complement 
                                               value of A) 
^          up arrow, universal    ^F3.0        (interprets 3.0 as a 
           unary operator                      1-word floating-point 
                                               number) 
                                  ^C24         (interprets the one's 
                                               complement of the binary 
                                               representation of 24(8)) 
                                  ^D127        (interprets 127 as a decimal 
                                               number) 
                                  ^O34         (interprets 34 as an octal 
                                               number) 
                                  ^B11000111   (interprets 11000111 as a 
                                               binary value) 
TABLE E.4 

LEGAL BINARY OPERATORS [20] 

Binary 
Operator   Explanation            Example 

+          addition               A+B 
-          subtraction            A-B 
*          multiplication         A*B   (16-bit product returned) 
/          division               A/B   (16-bit quotient returned) 
&          logical AND            A&B 
!          logical inclusive OR   A!B 
operators, respectively. Note the ^O and ^B constructions which are 
used extensively in the tse computer macros to indicate whether a 
number is in the octal or binary radix. Operands can be numbers or 
previously defined symbols. The symbols are usually defined by direct 
assignment statements which have the general format 

symbol = expression. 

The tse mask symbols, tse register symbols, and COSMAC register symbols 
are all defined by direct assignment statements. Their decimal values 
are given in Table 7.3, page 157. 
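
The radix constructions can be illustrated by converting a few terms to their decimal values. This is a minimal sketch that handles single numeric terms only; the function name is hypothetical, and the fallback to octal for an unmarked number and the trailing period for decimal are stated here as assumptions about Macro's number syntax.

```python
def radix_value(term):
    """Return the integer value of one numeric term written with a radix marker."""
    if term.startswith("^O"):
        return int(term[2:], 8)       # temporary octal radix
    if term.startswith("^B"):
        return int(term[2:], 2)       # temporary binary radix
    if term.startswith("^D"):
        return int(term[2:], 10)      # temporary decimal radix
    if term.endswith("."):
        return int(term[:-1], 10)     # trailing period: decimal number
    return int(term, 8)               # assumed default radix: octal


mask_symbol = radix_value("^O341")
control_byte = radix_value("^B00000011")
delay_address = radix_value("2247.")
```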

In some instructions and macro definitions, the current value 
of the assembly location counter must be known. A special symbol, the 
period, is used to represent the current value of the assembly location 
counter. The period can be used in any expression in which the other 
defined symbols are legal. The tse computer cross-assembler uses 
the current value of the assembly location counter to check for illegal 
attempts to branch across page boundaries using short branch instructions. 
When an illegal branch is detected, a .ERROR directive is used to output 
an error message to the list file as a warning to the programmer. Error 
messages are also printed out when the programmer attempts to use an 
illegal input/output port or register specification. 
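
The page boundary check can be sketched as a comparison of high order address bytes. The cross-assembler's own checking macro is not reproduced here, so the function below is an assumption about the rule being enforced: a short branch replaces only R(P).0, so the target must lie on the same 256-byte page as the byte that follows the operation code.

```python
def short_branch_is_legal(location, target):
    """Check that a short branch at 'location' can reach 'target'.

    'location' stands for the assembly location counter at the branch
    operation code; the operand byte is assumed to sit at location + 1.
    """
    operand_address = (location + 1) & 0xFFFF
    return (operand_address >> 8) == (target >> 8)


same_page = short_branch_is_legal(0x10F0, 0x10FE)
crosses_page = short_branch_is_legal(0x10F0, 0x1100)
```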

The RT-11 Macro assembler provides several types of assembler 
directives which occupy the operator field of a Macro source line 
and cause the assembler to perform special processing operations. 
Listing control, data storage, terminating, conditional assembly, and 
macro directives are all used in the tse computer cross-assembler. 

The listing control directives .LIST and .NLIST are used to 
control the contents of the list file created by the assembly process. 
Macro utilizes a listing level count to determine whether or not a 
particular line of source code should be listed. When used without 
an argument, the .LIST and .NLIST statements cause the listing level 
count to be incremented or decremented, respectively. Listing is 
suppressed whenever the listing level count is negative. 
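
The listing level count can be modelled in a few lines. This is a sketch under simplifying assumptions: only the argument-free forms of the directives are handled, the count starts at zero, and the directive lines themselves are left out of the simulated list file. The function name is hypothetical.

```python
def listed_lines(source):
    """Return the source lines that would appear in the list file."""
    level = 0                      # listing level count
    shown = []
    for line in source:
        operator = line.strip()
        if operator == ".LIST":
            level += 1             # .LIST increments the count
        elif operator == ".NLIST":
            level -= 1             # .NLIST decrements the count
        elif level >= 0:           # listing is suppressed while the count is negative
            shown.append(line)
    return shown


simple = listed_lines(["A", ".NLIST", "B", ".LIST", "C"])
nested = listed_lines([".NLIST", ".NLIST", "X", ".LIST", "Y", ".LIST", "Z"])
```

The nested case shows why a count is used instead of a flag: listing resumes only after every .NLIST has been matched by a .LIST.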

The listing control directives can also be used with arguments. 
In that case, the listing level count is not affected, but the listing 
mode is overridden in a manner specified by the argument. The most 
commonly used listing directive arguments are shown in Table E.5. 

Listing directives are used extensively in the tse computer cross- 
assembler to prevent macro expansions from listing. This improves the 
readability of the assembled applications programs. 

Since the PDP 11/40 is a 16-bit minicomputer, the RT-11 Macro 
assembler normally works with 16-bit numbers. The COSMAC microprocessor, 
however, is an eight-bit machine that requires eight-bit source code and 
data bytes. A data storage directive, .BYTE, allows the Macro assembler 
to produce object files which are suitable for the COSMAC control unit 
of the tse computer. The .BYTE directive truncates specified arguments 
to eight bits. The argument can be a number or any legal expression 
whose 16-bit value has a high byte that contains either all zeros or all 
ones. 
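
The truncation rule can be sketched as follows. The function is a hypothetical model of the rule stated above and does not reproduce the assembler's actual error handling: a 16-bit value is accepted only when its high byte is all zeros or all ones, and the low byte is kept.

```python
def byte_directive(value):
    """Truncate a 16-bit expression value to the eight bits stored by .BYTE."""
    word = value & 0xFFFF              # Macro works with 16-bit numbers
    high = word >> 8
    if high not in (0x00, 0xFF):       # high byte must be all zeros or all ones
        raise ValueError("value does not fit in one byte")
    return word & 0xFF


set_p_r3 = byte_directive(0o320 + 3)   # a small positive value passes unchanged
minus_one = byte_directive(-1)         # high byte of all ones is accepted
```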

One terminating directive is used in the tse computer cross-assembler 
to indicate the physical end of a source program. This .END directive 
can have an optional argument that indicates the entry point of the 
TABLE E.5 

SOME ALLOWABLE LISTING DIRECTIVE ARGUMENTS [20] 

Argument   Default   Function 

SEQ        list      Controls the listing of source line 
                     sequence numbers. 

LOC        list      Controls the listing of the location 
                     counter (this field would not normally 
                     be suppressed). 

BIN        list      Controls the listing of generated binary 
                     code (supersedes BEX). 

SRC        list      Controls the listing of the source code. 

COM        list      Controls the listing of comments. This 
                     is a subset of the SRC argument and can 
                     be used to reduce listing time and/or 
                     space where comments are unnecessary. 

SYM        list      Controls the listing of the symbol table 
                     for the assembly. 
program. Often, this feature is used to provide automatic start-up 
of programs after they are loaded into the computer. 

Conditional assembly directives are one of the most important 
Macro assembler features. These directives provide the programmer 
with the capability to conditionally include or ignore blocks of 
source code during the assembly process. The general form of a 
conditional block of Macro code is 

.IF condition, argument(s)    ; Start of Conditional Block 
  .                           ; Statements in the Range of 
  .                           ; the Conditional Block 
.ENDC                         ; End of Conditional Block 

where the condition which must be met for the block to be included 
in the assembly is one of those given in Table E.6. 

There are three subconditional directives (Table E.7) which can 
be placed within conditional blocks to indicate that an alternate 
section of code should be assembled when the main condition is not 
met. Alternatively, these subconditionals can be used to indicate the 
unconditional assembly of a section of code within the conditional block. 
The value of the condition, found upon entering the conditional block 
of code, is the implied argument of the subconditional statements. 

One line conditional blocks can be written using an immediate 
conditional directive of the form 

.IIF condition, argument, statement. 

The allowable conditions and arguments are the same as those defined 
earlier. Note that a .ENDC statement is not required for immediate 
conditionals. 
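
The interaction between a conditional block and its subconditionals can be sketched with a small model. It covers only the expression conditions and a single, non-nested block; the names `CONDITIONS` and `assemble_block` are hypothetical, and the model is not the assembler's actual algorithm.

```python
# Expression conditions: each maps the expression value to True or False.
CONDITIONS = {
    "EQ": lambda v: v == 0, "NE": lambda v: v != 0,
    "GT": lambda v: v > 0,  "LE": lambda v: v <= 0,
    "LT": lambda v: v < 0,  "GE": lambda v: v >= 0,
    "Z":  lambda v: v == 0, "NZ": lambda v: v != 0,
    "G":  lambda v: v > 0,  "L":  lambda v: v < 0,
}


def assemble_block(condition, value, body):
    """Return the body lines of one conditional block that would be assembled."""
    entered = CONDITIONS[condition](value)   # condition value found on entry
    include = entered                        # lines before any subconditional
    kept = []
    for line in body:
        operator = line.strip()
        if operator == ".IFT":
            include = entered                # assemble if the condition was true
        elif operator == ".IFF":
            include = not entered            # assemble if the condition was false
        elif operator == ".IFTF":
            include = True                   # assemble regardless
        elif include:
            kept.append(line)
    return kept


body = [".IFT", "A", ".IFF", "B", ".IFTF", "C"]
when_true = assemble_block("EQ", 0, body)
when_false = assemble_block("EQ", 5, body)
```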






TABLE E.6 

ALLOWABLE CONDITIONS [20] 

      Conditions 
Positive   Complement   Arguments               Assemble Block If 

EQ         NE           expression              expression = 0 (or ≠ 0) 
GT         LE           expression              expression > 0 (or ≤ 0) 
LT         GE           expression              expression < 0 (or ≥ 0) 
DF         NDF          symbolic argument       symbol is defined 
                                                (or undefined) 
B          NB           macro-type argument     argument is blank 
                                                (or non blank) 
IDN        DIF          two macro-type          arguments identical 
                        arguments separated     (or different) 
                        by a comma 
Z          NZ           expression              same as EQ/NE 
G                       expression              same as GT/LE 
L                       expression              same as LT/GE 
TABLE E.7 

SUBCONDITIONAL DIRECTIVES [20] 

Subconditional   Function 

.IFF             The code following this statement up to the next 
                 subconditional or end of the conditional block is 
                 included in the program if the value of the 
                 condition tested upon entering the conditional 
                 block is false. 

.IFT             The code following this statement up to the next 
                 subconditional or end of the conditional block is 
                 included in the program if the value of the 
                 condition tested upon entering the conditional 
                 block is true. 

.IFTF            The code following this statement up to the next 
                 subconditional or the end of the conditional block 
                 is included in the program regardless of the value 
                 of the condition tested upon entering the conditional 
                 block. 

Macro directives are the final type of assembler directives 
utilized by the tse computer cross-assembler. A .MACRO statement of 
the form 

.MACRO name, dummy argument list 

serves as the first statement of each macro definition. The name of 
the macro can be any legal symbol. Similarly, any required arguments 
are represented by legal symbols in the dummy argument list. These 
symbols can be used outside the body of the macro definition with no 
conflicts of definition. A comment field can be included after the 
dummy argument list. 

The last statement in each macro definition must be a .ENDM 
directive. The .ENDM directive is of the form 

.ENDM name 

where name is an optional argument which, if used, must be the 
name of the macro being terminated. Examples of correctly defined 
and terminated macros are given in Appendix F. 


APPENDIX F 

SELECTED MACRO DEFINITIONS FROM THE TSE 
COMPUTER CROSS-ASSEMBLER MACRO LIBRARY 

Figure F.1 is a listing of some representative macro definitions 
from the tse cross-assembler macro library. The macro library is 
intended to be used with Digital Equipment Corporation's RT-11 MACRO 
assembler and a PDP 11/40 minicomputer. 
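
The effect of these macros is to turn each COSMAC mnemonic into one or more .BYTE values. The sketch below mirrors three of the definitions in Figure F.1 (SEP, OUT, and MLDLY) as Python functions so that the generated bytes can be checked by hand. The function names are hypothetical, and the register and port range checks performed by the library's error macros are omitted.

```python
def sep(reg):
    """SEP: .BYTE <^O320+REG>"""
    return [0o320 + reg]


def out(port):
    """OUT: .BYTE <^O140+REGL>"""
    return [0o140 + port]


def mldly(adr):
    """MLDLY: MARK; SEP RE followed by the high and low address bytes; DEC R2."""
    high = (adr & 0o177400) // 0o400   # high order byte of the delay address
    low = adr & 0o377                  # low order byte of the delay address
    return [0o171, 0o336, high, low, 0o042]


delay_bytes = mldly(2247)
octal_listing = ["%03o" % byte for byte in delay_bytes]
```

For the call MLDLY 2247. the sketch yields the octal bytes 171, 336, 010, 307, 042, which are the values listed at locations 010004 through 010010 of Figure D.1.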






        .MACRO  SEP     REG 
        .NLIST  SRC 
        $$$ER   REG 
        .BYTE   <^O320+REG> 
        .LIST   SRC 
        .ENDM   SEP 


        .MACRO  REQ 
        .NLIST  SRC 
        .BYTE   ^O172 
        .LIST   SRC 
        .ENDM   REQ 


        .MACRO  MLDLY   ADR 
        .NLIST  SRC 
        .BYTE   ^O171                                   ; MARK 
        .BYTE   ^O336,<ADR&^O177400>/^O400,ADR&^O377    ; SEP RE, HIADR, LADR 
        .BYTE   ^O042                                   ; DEC R2 
        .LIST   SRC 
        .ENDM   MLDLY 


        .MACRO  DLY6 
        .NLIST  SRC 
        .BYTE   ^O304,^O304     ; NOP NOP 
        .LIST   SRC 
        .ENDM   DLY6 


        .MACRO  LDI     RP 
        .NLIST  SRC 
        .BYTE   ^O370,RP 
        .LIST   SRC 
        .ENDM   LDI 


Figure F.1 Representative macro definitions from the tse computer 
cross-assembler macro library. 


        .MACRO  LDN     REG 
        .NLIST  SRC 
        $$$ER   REG 
        .BYTE   REG 
        .LIST   SRC 
        .ENDM   LDN 

        .MACRO  STR     REG 
        .NLIST  SRC 
        $$$ER   REG 
        .BYTE   <^O120+REG> 
        .LIST   SRC 
        .ENDM   STR 

        .MACRO  DB      X1,X2,X3,X4,X5,X6 
        .NLIST  SRC 
        .BYTE   X1 
        .IF     NB,X2 
        .BYTE   X2 
        .IF     NB,X3 
        .BYTE   X3 
        .IF     NB,X4 
        .BYTE   X4 
        .IF     NB,X5 
        .BYTE   X5 
        .IF     NB,X6 
        .BYTE   X6 
        .ENDC 
        .ENDC 
        .ENDC 
        .ENDC 
        .ENDC 
        .LIST   SRC 
        .ENDM   DB 


Figure F.1 (Continued) 


        .MACRO  B1      RP 
        .NLIST  SRC 
        $$$PAG  RP 
        .BYTE   ^O064,RP&^O377 
        .LIST   SRC 
        .ENDM   B1 

        .MACRO  BZ      RP 
        .NLIST  SRC 
        $$$PAG  RP 
        .BYTE   ^O062,RP&^O377 
        .LIST   SRC 
        .ENDM   BZ 


        .MACRO  LBR     ADR 
        .NLIST  SRC 
ADR1=<^O177400&ADR>/^O400 
ADR2=ADR&^O377 
        .BYTE   ^O300,ADR1,ADR2 
        .LIST   SRC 
        .ENDM   LBR 


        .MACRO  PLO     REG 
        .NLIST  SRC 
        $$$ER   REG 
        .BYTE   <^O240+REG> 
        .LIST   SRC 
        .ENDM   PLO 


        .MACRO  DEC     REG 
        .NLIST  SRC 
        $$$ER   REG 
        .BYTE   <^O040+REG> 
        .LIST   SRC 
        .ENDM   DEC 


Figure F.1 (Continued) 


        .MACRO  OUT     REGL 
        .NLIST  SRC 
        $$$ERL  REGL 
        .BYTE   <^O140+REGL> 
        .LIST   SRC 
        .ENDM   OUT 


        .MACRO  TCLRI                   ; CLEAR REGISTER I 
        .NLIST  SRC 
        .BYTE   ^O144                   ; OUT 4 
        .BYTE   ^B00000000              ; DB 0 
        .BYTE   ^O327                   ; LDLY 494. 
        .DW     0756                    ; DELAY 2 GATE DELAYS 
        .LIST   SRC                     ; THIS MACRO TURNS THE FEEDBACK PATH 
        .ENDM   TCLRI                   ; REFORMATTER OF LATCH I OFF 


        .MACRO  TCMR    REG1,REG2,MASK  ; COMPLEMENT REGISTER 
        .NLIST  SRC 
        .BYTE   ^O331                   ; SEP R9 
        .IF     EQ,REG2-^O341 
        .IFT                            ; ASSEMBLE IF REG2 IS A MASK 
        .BYTE   MASK                    ; MASK CONTROL BYTE 
        .BYTE   ^B10110001              ; REGISTER OUTPUT CONTROL BYTE 
        .IFF                            ; ASSEMBLE IF REG2 IS NOT A MASK 
        .BYTE   <MASK*16.>              ; MASK CONTROL BYTE 
R0=<REG2&^O017>                         ; PICK OFF BOTTOM 4 BITS OF REG2 
        .BYTE   <^O220!R0>              ; REGISTER OUTPUT CONTROL BYTE 
        .ENDC 
        .BYTE   ^B00000001              ; PORT SIX CONTROL BYTE. P61 ON 
        .BYTE   REG1                    ; REGISTER INPUT CONTROL BYTE NUMBER 1 
        .BYTE   <^O360!REG1>            ; REGISTER INPUT CONTROL BYTE NUMBER 2 
        .LIST   SRC 
        .ENDM   TCMR 


Figure F.1 (Continued) 





        .MACRO  TMOV    REG1,REG2       ; MOVE REGISTER TO REGISTER 
        .NLIST  SRC 
        .BYTE   ^O331                   ; SEP R9 
        .BYTE   ^B00000000              ; MASK CONTROL BYTE 
        .IF     EQ,REG2-^O341 
        .IFT                            ; ASSEMBLE IF REG2 IS A MASK 
        .BYTE   ^B01100001              ; REGISTER OUTPUT CONTROL BYTE 
        .BYTE   ^B00000000              ; PORT SIX CONTROL BYTE. P61 OFF 
        .IFF                            ; ASSEMBLE IF REG2 IS NOT A MASK 
R0=<REG2&^O017> 
        .BYTE   <^O020!R0>              ; REGISTER OUTPUT CONTROL BYTE 
        .BYTE   ^B00000001              ; PORT SIX CONTROL BYTE. P61 ON 
        .ENDC 
        .BYTE   REG1                    ; REGISTER INPUT CONTROL BYTE NUMBER 1 
        .BYTE   <^O360!REG1>            ; REGISTER INPUT CONTROL BYTE NUMBER 2 
        .LIST   SRC 
        .ENDM   TMOV 

        .MACRO  TMVI    REG,MASK        ; MOVE IMMEDIATE TO REGISTER 
        .NLIST  SRC 
        .BYTE   ^O331                   ; SEP R9 
        .BYTE   MASK                    ; MASK CONTROL BYTE 
        .BYTE   ^B00000000              ; REGISTER OUTPUT CONTROL BYTE 
        .BYTE   ^B00000001              ; PORT SIX CONTROL BYTE. P61 ON 
        .BYTE   REG                     ; REGISTER INPUT CONTROL BYTE NUMBER 1 
        .BYTE   <^O360!REG>             ; REGISTER INPUT CONTROL BYTE NUMBER 2 
        .LIST   SRC 
        .ENDM   TMVI 


Figure F.1 (Continued) 


        .MACRO  TANDN REG1,REG2,MASK    ; AND NOT REGISTER TO A
        .NLIST  SRC
        .BYTE   ^O331                   ; SEP R9
        .IF EQ,REG2-^O341
        .IFT                            ; ASSEMBLE IF REG2 IS A
        .BYTE   MASK                    ; MASK CONTROL BYTE
        .BYTE   ^B01100001              ; REGISTER OUTPUT CONTROL BYTE
        .IFF                            ; ASSEMBLE IF REG2 IS NOT A
        .BYTE   <MASK&^O007>            ; MASK CONTROL BYTE
        R0=<REG2&^O017>                 ; PICK OFF LOWER FOUR BITS OF REG2
        .BYTE   <^O140!R0>              ; REGISTER OUTPUT CONTROL BYTE
        .ENDC
        .BYTE   ^B00000000              ; PORT SIX CONTROL BYTE. P61 OFF
        .BYTE   REG1                    ; REGISTER INPUT CONTROL BYTE NUMBER 1
        .BYTE   <^O360!REG1>            ; REGISTER INPUT CONTROL BYTE NUMBER 2
        .LIST   SRC
        .ENDM   TANDN

        .MACRO  TOR REG,MASK            ; OR REGISTER TO A
        .NLIST  SRC
        .BYTE   ^O331                   ; SEP R9
        .IF EQ,REG-^O341
        .IFT                            ; ASSEMBLE IF REG IS A
        .BYTE   MASK                    ; MASK CONTROL BYTE
        .BYTE   ^B00100001              ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000000              ; PORT SIX CONTROL BYTE. P61 OFF
        .IFF                            ; ASSEMBLE IF REG IS NOT A
        .BYTE   <^O007&MASK>            ; MASK CONTROL BYTE
        .BYTE   <REG&^O017>             ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000001              ; PORT SIX CONTROL BYTE. P61 ON
        .ENDC
        .BYTE   <^O360!REG>             ; REGISTER INPUT CONTROL BYTE NUMBER 1
        .BYTE   O03S0!REG>              ; REGISTER INPUT CONTROL BYTE NUMBER 2
        .LIST   SRC
        .ENDM   TOR


Figure F.1 (Continued)


        .MACRO  TCNT REG,MASK           ; CONTRACT REGISTER
        .NLIST  SRC
        .BYTE   ^O333                   ; SEP RC
        .IF EQ,REG-^O341
        .IFT                            ; ASSEMBLE IF REG IS A
        .BYTE   MASK                    ; MASK CONTROL BYTE
        .BYTE   ^B00100001              ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000000              ; PORT SIX CONTROL BYTE. P61 OFF
        .IFF                            ; ASSEMBLE IF REG IS NOT A
        .BYTE   <MASK&^O007>            ; ALU MASK CONTROL BYTE
        .BYTE   <REG&^O177>             ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000000              ; PORT SIX CONTROL BYTE. P61 ON
        .ENDC
        .LIST   SRC
        .ENDM   TCNT

        .MACRO  TOUT REG                ; OUTPUT TSE
        .NLIST  SRC
        .BYTE   ^O331                   ; SEP R9
        .BYTE   ^B00000000              ; MASK CONTROL BYTE
        R0=<REG&^O017>
        .BYTE   <^O020!R0>              ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000001              ; PORT SIX CONTROL BYTE. P61 ON
        .BYTE   ^B01111000              ; REGISTER O INPUT CONTROL BYTE NUMBER 1
        .BYTE   ^B11111000              ; REGISTER O INPUT CONTROL BYTE NUMBER 2
        .LIST   SRC
        .ENDM   TOUT


Figure F.1 (Continued)




        .MACRO  TIDA X                  ; IDENTIFY BASIS POINTS WITH A
                                        ; SURROUND OF INDEX X
        .NLIST  SRC
        .BYTE   ^O332                   ; SEP RA
        .IF EQ,X                        ; ASSEMBLE IF INDEX IS ZERO
        .BYTE   ^O001                   ; 1
        .BYTE   ^O277                   ; INDEX ZERO CONTROL BYTE
        .ENDC
        .IF EQ,X-^O006                  ; ASSEMBLE IF INDEX IS SIX
        .BYTE   ^O001                   ; 1
        .BYTE   ^O200                   ; INDEX SIX CONTROL BYTE
        .ENDC
        .IF EQ,X-^O007                  ; ASSEMBLE IF INDEX IS SEVEN
        .BYTE   ^O002                   ; 2
        .BYTE   ^O252,^O225             ; INDEX SEVEN CONTROL BYTES
        .ENDC
        .IF GT,X-^O013                  ; ASSEMBLE IF INDEX IS TWELVE OR THIRTEEN
        .BYTE   ^O003                   ; 3
        .IF EQ,X-^O014                  ; ASSEMBLE IF INDEX IS TWELVE
        .BYTE   ^O266,^O255,^O233       ; INDEX TWELVE CONTROL BYTES
        .ENDC
        .IF EQ,X-^O015                  ; ASSEMBLE IF INDEX IS THIRTEEN
        .BYTE   ^O211,^O222,^O244       ; INDEX THIRTEEN CONTROL BYTES
        .ENDC
        .ENDC
        .IF NE,X                        ; ASSEMBLE IF INDEX IS NOT ZERO
        .IF NE,X-^O006                  ; ASSEMBLE IF INDEX IS NOT SIX
        .IF NE,X-^O007                  ; ASSEMBLE IF INDEX IS NOT SEVEN
        .IF LE,X-^O013                  ; ASSEMBLE IF INDEX IS NOT TWELVE OR THIRTEEN
        .BYTE   ^O006                   ; 6
        .IF EQ,X-^O001                  ; ASSEMBLE IF INDEX IS ONE
        .BYTE   ^O276,^O275,^O273       ; INDEX ONE CONTROL BYTES
        .BYTE   ^O267,^O257,^O237
        .ENDC


Figure F.1 (Continued)




        .IF EQ,X-^O002                  ; ASSEMBLE IF INDEX IS TWO
        .BYTE   ^O274,^O271,^O263       ; INDEX TWO CONTROL BYTES
        .BYTE   ^O247,^O217,^O236
        .ENDC
        .IF EQ,X-^O003                  ; ASSEMBLE IF INDEX IS THREE
        .BYTE   ^O270,^O261,^O243       ; INDEX THREE CONTROL BYTES
        .BYTE   ^O207,^O216,^O234
        .ENDC
        .IF EQ,X-^O004                  ; ASSEMBLE IF INDEX IS FOUR
        .BYTE   ^O260,^O251,^O203       ; INDEX FOUR CONTROL BYTES
        .BYTE   ^O206,^O215,^O230
        .ENDC
        .IF EQ,X-^O005                  ; ASSEMBLE IF INDEX IS FIVE
        .BYTE   ^O240,^O201,^O202       ; INDEX FIVE CONTROL BYTES
        .BYTE   ^O204,^O210,^O220
        .ENDC
        .IF EQ,X-^O010                  ; ASSEMBLE IF INDEX IS EIGHT
        .BYTE   ^O254,^O231,^O262       ; INDEX EIGHT CONTROL BYTES
        .BYTE   ^O245,^O213,^O226
        .ENDC
        .IF EQ,X-^O011                  ; ASSEMBLE IF INDEX IS NINE
        .BYTE   ^O264,^O251,^O223       ; INDEX NINE CONTROL BYTES
        .BYTE   ^O236,^O215,^O232
        .ENDC
        .IF EQ,X-^O012                  ; ASSEMBLE IF INDEX IS TEN
        .BYTE   ^O242,^O250,^O221       ; INDEX TEN CONTROL BYTES
        .BYTE   ^O224,^O205,^O212
        .ENDC
        .IF EQ,X-^O013                  ; ASSEMBLE IF INDEX IS ELEVEN
        .BYTE   ^O253,^O227,^O256       ; INDEX ELEVEN CONTROL BYTES
        .BYTE   ^O235,^O272,^O265
        .ENDC
        .ENDC


Figure F.1 (Continued)
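
Several of the TIDA control-byte tables appear to consist of the distinct cyclic rotations of one six-bit surround pattern, each with the high bit (octal 200) set, preceded by a count of the patterns. The Python sketch below illustrates that reading for the index one, two, three, seven and twelve tables only. The interpretation, the function names, and the choice of starting pattern for each index are assumptions drawn from the listed bytes, not statements from the macro source.

```python
def rotations(pattern: int, width: int = 6) -> list[int]:
    """Distinct cyclic left rotations of a width-bit pattern, in order of appearance."""
    mask = (1 << width) - 1
    p = pattern & mask
    seen: list[int] = []
    for _ in range(width):
        if p not in seen:
            seen.append(p)
        # Rotate left by one position within the width-bit field.
        p = ((p << 1) | (p >> (width - 1))) & mask
    return seen


def tida_bytes(pattern: int) -> list[int]:
    """SEP RA, pattern count, then one control byte per distinct rotation."""
    rots = rotations(pattern)
    return [0o332, len(rots)] + [0o200 | r for r in rots]


if __name__ == "__main__":
    # Assumed starting patterns for index one, two and three.
    for start in (0b111110, 0b111100, 0b111000):
        print(" ".join(f"{b:03o}" for b in tida_bytes(start)))
```

Under these assumptions, `tida_bytes(0b111110)` reproduces the eight bytes listed for `TIDA 1` in Figure G.1 (332 006 276 275 273 267 257 237).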





        .ENDC
        .ENDC
        .ENDC
        .LIST   SRC
        .ENDM   TIDA

        .MACRO  TCMP REG,MASK           ; COMPARE REGISTER
        .NLIST  SRC
        .BYTE   ^O333                   ; SEP RC
        M0=<MASK*16.>                   ; SHIFT LOWER FOUR BITS OF MASK BYTE INTO
        M1=<M0!MASK>                    ; UPPER FOUR BITS AND COPY BACK INTO LOWER FOUR BITS ALSO
        .BYTE   <M1&^O167>              ; MASK CONTROL BYTE
        R0=<REG&^O17>                   ; PICK OFF BOTTOM 4 BITS OF REG
        .BYTE   <R0!^O201>              ; REGISTER OUTPUT CONTROL BYTE
        .BYTE   ^B00000001              ; PORT SIX CONTROL BYTE. P61 ON
        .LIST   SRC
        .ENDM   TCMP

        .MACRO  TBNZ RP                 ; TSE BRANCH ON RF NOT EQUAL TO ZERO
        .NLIST  SRC
        $$$PAG  RP
        .BYTE   ^O217                   ; GLO RF
        .BYTE   ^O072,RP&^O377          ; BNZ RP
        .LIST   SRC
        .ENDM   TBNZ

        .MACRO  TIN                     ; INPUT TSE
        .NLIST  SRC
        .BYTE   ^O335                   ; SEP RD
        .LIST   SRC
        .ENDM   TIN




Figure F.1 (Continued) 
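
The mask arithmetic in TCMP and the branch encoding in TBNZ can be checked by hand or with a few lines of code. The Python sketch below follows the macro bodies as printed. The register code for C (octal 264) and the mask value 4 are assumptions chosen because they reproduce the bytes listed for `TCMP C,M` in Figure G.1; the function names are illustrative.

```python
def tcmp(reg: int, mask: int) -> list[int]:
    """Bytes that TCMP REG,MASK would assemble to (sketch of the macro body)."""
    m0 = mask * 16            # shift the lower four mask bits into the upper four
    m1 = m0 | mask            # and keep a copy in the lower four bits
    r0 = reg & 0o17           # bottom four bits of the register code
    return [0o333, m1 & 0o167, r0 | 0o201, 0b00000001]


def tbnz(target: int) -> list[int]:
    """GLO RF followed by BNZ to the low byte of the target address."""
    return [0o217, 0o072, target & 0o377]


if __name__ == "__main__":
    REG_C = 0o264             # assumed code for register C
    print(" ".join(f"{b:03o}" for b in tcmp(REG_C, 4)))
    print(" ".join(f"{b:03o}" for b in tbnz(0o101)))
```

With these inputs the mask control byte is octal 104 and the register output control byte is octal 205, matching locations 000261 and 000262 of Figure G.1. A branch to ANOTHER at 000101 encodes as 217 072 101, matching locations 000264 through 000266.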



        .MACRO  $$$PAG RPT
        .NLIST  SRC
        ADR1=<^O177400&<RPT>>/^O400
        ADR2=<^O177400&<.>>/^O400
        .IIF NE,<ADR1-ADR2>, .ERROR     ; ILLEGAL BRANCH OVER PAGE BOUNDARY
        .ENDM   $$$PAG


        .MACRO  $$$ER REGT
        .IIF GE,REGT, $$$2=$$$2-1
        .IIF LE,<REGT-15.>, $$$2=$$$2-1
        .IIF NE,$$$2, .ERROR            ; ILLEGAL REGISTER SPECIFICATION
        .ENDM   $$$ER

        .MACRO  $$$ERL REGLT
        $$$2=2
        .IIF GE,REGLT, $$$2=$$$2-1
        .IIF LE,<REGLT-7.>, $$$2=$$$2-1
        .IIF NE,$$$2, .ERROR            ; ILLEGAL PORT ASSIGNMENT
        .ENDM   $$$ERL


Figure F.1 (Continued)
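
The $$$PAG macro rejects a short branch whose target lies on a different 256-byte page from the current location counter. A minimal Python sketch of the same test follows. The function names are illustrative, and `here` stands in for the assembler's location counter (`.`) at the point where the macro is invoked.

```python
def page(addr: int) -> int:
    """Page number of a 16-bit address: the upper byte, as in <^O177400&addr>/^O400."""
    return (addr & 0o177400) // 0o400


def check_branch(target: int, here: int) -> None:
    """Raise if a short branch from `here` to `target` would cross a page boundary."""
    if page(target) != page(here):
        raise ValueError("ILLEGAL BRANCH OVER PAGE BOUNDARY")


if __name__ == "__main__":
    check_branch(0o101, 0o264)          # same page: no error
    try:
        check_branch(0o101, 0o400)      # target on page 0, branch on page 1
    except ValueError as err:
        print(err)
```

The first call corresponds to the `TBNZ ANOTHER` at location 000264 of Figure G.1, where both addresses fall on page zero.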
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APPENDIX G 


SAMPLE APPLICATIONS PROGRAMS 
FOR THE GOLAY TRANSFORM TSE COMPUTER 

Figure G.1 is a listing of a program for performing the Golay
transform skeletonizing algorithm using the tse computer. The
skeletonizing program is 129 bytes long and requires 607 unit gate delays
to process a simple image. A program for performing the swelling
algorithm is listed as Figure G.2. Each iteration of the swelling
algorithm takes 664 unit gate delays. The swelling program is 147
bytes long.
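
The 129-byte length of the skeletonizing program can be cross-checked against the macro definitions in Figure F.1. The Python sketch below assumes the expansion sizes implied by those macros (for TIDA with index one, two or three: SEP RA, a count byte, and six control bytes) and the statement sequence shown in Figure G.1. The dictionary and list names are illustrative.

```python
# Bytes emitted per statement, counted from the Figure F.1 macro bodies.
# B2 and BR are two-byte short-branch instructions.
SIZES = {
    "TIN": 1, "TMOV": 6, "TCLRI": 5, "TIDA": 8, "TANDN": 6,
    "TCMP": 4, "TBNZ": 3, "TOUT": 6, "B2": 2, "BR": 2,
}

# One subfield pass: clear the index register, recognize indexes one to three,
# then perform the Golay function operation.
SUBFIELD = ["TCLRI", "TIDA", "TIDA", "TIDA", "TANDN"]

PROGRAM = ["TIN", "TMOV"] + SUBFIELD * 3 + ["TCMP", "TBNZ", "TOUT", "B2", "BR"]

total = sum(SIZES[op] for op in PROGRAM)

# The listing occupies locations 000100 through 000300 (octal), inclusive.
span = 0o300 - 0o100 + 1

if __name__ == "__main__":
    print(total, span)  # 129 129
```

Both counts come to 129, in agreement with the stated program length.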


000100          START:    TIN               ; INPUT NEXT IMAGE FOR SKELETONIZING
000100  335
000101          ANOTHER:  TMOV    C,A       ; SAVE IMAGE IN REGISTER C
000101  331
000102  000
000103  141
000104  000
000105  264
000106  364
000107                    TCLRI             ; CLEAR INDEX REGISTER
000107  144
000110  000
000111  327
000112  001
000113  356
000114                    TIDA    1         ; RECOGNIZE INDEX ONE
000114  332
000115  006
000116  276
000117  275
000120  273
000121  267
000122  257
000123  237
000124                    TIDA    2         ; RECOGNIZE INDEX TWO
000124  332
000125  006
000126  274
000127  271
000130  263
000131  247
000132  217
000133  236


Figure G.1 A program for performing the Golay transform
skeletonizing algorithm.




000134                    TIDA    3         ; RECOGNIZE INDEX THREE
000134  332
000135  006
000136  270
000137  261
000140  243
000141  207
000142  216
000143  234
000144                    TANDN   A,I,M1    ; PERFORM GOLAY FUNCTION OPERATION
                                            ; FOR THE FIRST SUBFIELD
000144  331
000145  [illegible]
000146  150
000147  000
000150  341
000151  361
000152                    TCLRI             ; CLEAR INDEX REGISTER
000152  144
000153  000
000154  327
000155  001
000156  356
000157                    TIDA    1         ; RECOGNIZE INDEX ONE
000157  332
000160  006
000161  276
000162  275
000163  273
000164  267
000165  257
000166  237


Figure G.1 (Continued)




000167                    TIDA    2         ; RECOGNIZE INDEX TWO
000167  332
000170  006
000171  274
000172  271
000173  263
000174  247
000175  217
000176  236
000177                    TIDA    3         ; RECOGNIZE INDEX THREE
000177  332
000200  006
000201  270
000202  261
000203  243
000204  207
000205  216
000206  234
000207                    TANDN   A,I,M2    ; PERFORM GOLAY FUNCTION OPERATION
                                            ; FOR THE SECOND SUBFIELD
000207  331
000210  002
000211  150
000212  000
000213  341
000214  361
000215                    TCLRI             ; CLEAR INDEX REGISTER
000215  144
000216  000
000217  327
000220  001
000221  356


Figure G.1 (Continued)




000222                    TIDA    1         ; RECOGNIZE INDEX ONE
000222  332
000223  006
000224  276
000225  275
000226  273
000227  267
000230  257
000231  237
000232                    TIDA    2         ; RECOGNIZE INDEX TWO
000232  332
000233  006
000234  274
000235  271
000236  263
000237  247
000240  217
000241  236
000242                    TIDA    3         ; RECOGNIZE INDEX THREE
000242  332
000243  006
000244  270
000245  261
000246  243
000247  207
000250  216
000251  234


Figure G.1 (Continued)




000252                    TANDN   A,I,M3    ; PERFORM GOLAY FUNCTION OPERATION
                                            ; FOR THE THIRD SUBFIELD
000252  331
000253  007
000254  150
000255  000
000256  341
000257  361
000260                    TCMP    C,M       ; COMPARE ITERATION RESULT WITH IMAGE
                                            ; SAVED IN REGISTER C
000260  [illegible]
000261  104
000262  205
000263  001
000264                    TBNZ    ANOTHER   ; BRANCH TO PERFORM ANOTHER
                                            ; ITERATION IF DIFFERENT
000264  217
000265  072
000266  101
000267                    TOUT    A         ; OTHERWISE OUTPUT THE SKELETON IMAGE
000267  331
000270  000
000271  021
000272  001
000273  170
000274  370
000275          CHECK:    B2      START     ; CHECK FOR A NEW INPUT IMAGE
000275  065
000276  100
000277                    BR      CHECK     ; WAIT IN A LOOP FOR THE NEW IMAGE.
000277  060
000300  275
000301                    .END


Figure G.1 (Continued)




000400          BEGIN:    TIN               ; INPUT IMAGE FOR SWELLING
000400  335
000401          CONTINUE: TMOV    C,A       ; SAVE IMAGE IN REGISTER C
000401  331
000402  000
000403  141
000404  000
000405  264
000406  364
000407                    TCLRI             ; CLEAR INDEX REGISTER
000407  144
000410  000
000411  327
000412  001
000413  356
000414                    TIDA    3         ; RECOGNIZE INDEX THREE
000414  332
000415  006
000416  270
000417  261
000420  243
000421  207
000422  216
000423  234
000424                    TIDA    4         ; RECOGNIZE INDEX FOUR
000424  332
000425  006
000426  260
000427  231
000430  [illegible]
000431  206
000432  215
000433  230


Figure G.2 A program for performing the Golay transform
swelling algorithm.
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Figure G-2 (Continued) 
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Figure G.2 (Continued) 
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Figure G.2 (Continued) 
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